PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS
OF PREPARING PLANT ARTIFICIAL CHROMOSOMES

RELATED APPLICATIONS 

Benefit of priority is claimed to U.S. Provisional Application No.
60/294,687, filed May 30, 2001, by CARL PEREZ AND STEVEN FABIJANSKI,
entitled PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING PLANT ARTIFICIAL CHROMOSOMES, and to U.S. Provisional
Application No. 60/296,329, filed June 4, 2001, by CARL PEREZ AND STEVEN
FABIJANSKI, entitled PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND
METHODS FOR PREPARING PLANT ARTIFICIAL CHROMOSOMES. This application is
related to U.S. Provisional Application No. 60/294,758, filed May 30,
2001, by EDWARD PERKINS et al., entitled CHROMOSOME-BASED PLATFORMS, and
to U.S. Provisional Application No. 60/366,891, filed March 21, 2002, by
EDWARD PERKINS et al., entitled CHROMOSOME-BASED PLATFORMS. This
application is also related to U.S. Provisional Application Attorney
Docket No. 24601-420, filed May 30, 2002, by EDWARD PERKINS et al.,
entitled CHROMOSOME-BASED PLATFORMS, and to PCT International Patent
Application Attorney Docket No. 24601-420PC, filed May 30, 2002, by
EDWARD PERKINS et al., entitled CHROMOSOME-BASED PLATFORMS.
This application is related to U.S. application Serial No. 08/695,191, filed 
August 7, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled 
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR 
PREPARING ARTIFICIAL CHROMOSOMES, now U.S. Patent No. 6,025,155.
This application is also related to U.S. application Serial No. 08/682,080,
filed July 15, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled 
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR 
PREPARING ARTIFICIAL CHROMOSOMES, now U.S. Patent No. 6,077,697. 
This application is also related to U.S. application Serial No. 08/629,822,
filed April 10, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES (now abandoned), and is also
related to copending U.S. application Serial No. 09/096,648, filed June 12,
1998, by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING
ARTIFICIAL CHROMOSOMES and to U.S. application Serial No. 09/835,682,
filed April 10, 1997 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES (now abandoned). This
application is also related to copending U.S. application Serial No.
09/724,726, filed November 28, 2000, U.S. application Serial No.
09/724,872, filed November 28, 2000, U.S. application Serial No.
09/724,693, filed November 28, 2000, U.S. application Serial No.
09/799,462, filed March 5, 2001, U.S. application Serial No. 09/836,911,
filed April 17, 2001, and U.S. application Serial No. 10/125,767, filed April
17, 2002, each of which is by GYULA HADLACZKY and ALADAR SZALAY,
and is entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND
METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES. This application
is also related to International PCT application No. WO 97/40183. Where
permitted, the subject matter of each of these applications is incorporated
by reference in its entirety.
FIELD OF THE INVENTION 

Artificial chromosomes and methods of producing artificial
chromosomes, particularly for use in delivery of nucleic acids and expression
thereof in plants, are provided. Also provided are methods of use of artificial
chromosomes in the delivery of nucleic acids to host cells, including plant
cells, and the expression of the nucleic acids therein. The resulting plant
cells, tissues, organs and whole plants containing the artificial chromosomes,
plant cell-based methods for production of heterologous proteins and
methods of producing transgenic organisms, particularly plants, using the
artificial chromosomes are also provided.
BACKGROUND OF THE INVENTION 

The stable transfer of nucleic acids into plant cells and the expression
of the nucleic acids therein pose many challenges. Many efforts at the
stable introduction of nucleic acids into plant cells have utilized
Agrobacterium-mediated transformation. Agrobacterium is a free-living
Gram-negative soil bacterium. Virulent strains of this bacterium are able to
infect plant tissue and induce the production of a neoplastic growth
commonly referred to as a crown gall. Virulent strains of Agrobacterium
contain a large plasmid DNA known as a Ti-plasmid that contains genes
required for DNA transfer (vir genes) and replication as well as a region of
DNA that is transferred to plant cells called T-DNA. The T-DNA region is
bordered by T-DNA border sequences that are crucial to the DNA transfer
process. These T-DNA border sequences are recognized by the products of
the vir genes encoded on the Ti-plasmid, and the vir genes are responsible
for the DNA transfer process.

Most wild-type Agrobacterium strains have a relatively broad dicot plant
host range and are capable of transferring T-DNA regions up to 25 kilobases
of DNA (e.g., nopaline strains) or more (e.g., octopine strains). Accordingly,
numerous methods of using Agrobacterium to transfer DNA into plant cells
have been developed based on the engineering of the Ti-plasmid to no longer
contain the genes responsible for altered morphology and replacing these
genes with a recombinant gene encoding a trait of interest. There are two
primary types of Agrobacterium-based plant transformation systems, binary
[see, e.g., U.S. Patent No. 4,940,838] and co-integrate [see, e.g., Fraley et
al. (1985) Biotechnology 3:629-635] methods. The T-DNA border repeats
are maintained in both systems and the natural DNA transfer process is used
to transfer the portion of DNA located between the T-DNA borders into the
plant cell.

Another plant cell transformation system, termed biolistics, involves
the bombardment of plant cells with microscopic particles coated with DNA
encoding a new trait. The particles are rapidly accelerated, typically by gas
or electrical discharge, through the cell wall and membranes, whereby the
DNA is released into the cell and is incorporated into the genome of the cell.
This method is used for transformation of many crops, including corn, wheat,
barley, rice, woody tree species and others.

A significant number of crop species of commercial interest have been
transformed using either Agrobacterium-mediated or biolistic systems.
However, these methods have many limitations that restrict their utility. For
example, there are limits to the size of the heterologous DNA that can be
transferred using these methods; typically, only one to two genes may be
transferred. Thus, although these methods may have utility in producing
crop products modified to contain a single new trait, such as insect or
herbicide tolerance, they may not be sufficient to transfer DNA that will
provide for multiple traits, or very large DNA segments encoding a
multiplicity of traits.

In addition, the genetically modified plant cells produced by these
methods tend to contain the transferred DNA in euchromatic regions of the
genomic DNA. Typically, a large number of independent transgenic insertion
events must be screened before a suitable event (such as insertion of a gene
into the host genomic DNA such that it provides a sufficient level of gene
expression within temporal and spatial expectations and without evidence of
gene rearrangement) is identified.

Another limitation of these methods is the effort required to utilize
them in the genetic modification of many commercially important crops. For
example, transformation efficiency can vary with the crop and can be low,
notably in cereal crops such as corn and wheat. Often the inserted genes
are rearranged and unstable over generations.

Furthermore, Agrobacterium tumefaciens relies on host-parasite
interaction in order to be successful. This has the effect that Agrobacterium
has a preference for some dicots, while other dicots, monocots and conifers
are resistant to transformation via Agrobacterium. Self-replicating vectors
have also been used in the transfer of nucleic acids into plant cells. Such
episomal vectors contain DNA sequences that are required for DNA
replication and sustainability of the vector in a living cell. In higher plants,
very few episomal vectors have been developed. These episomal vectors
have the drawback of having a very limited capacity for carrying genetic
information and are unstable. One example of an episomal plant vector is
the Cauliflower Mosaic Virus [Brisson et al. (1984) Nature 310:511].

Limitations of these gene delivery technologies necessitate the
development of alternative vector systems suitable for transferring large (up
to Mb size or larger) genes, gene complexes, and multiple genes together
with regulatory elements for safe, controlled, and persistent expression of
the desired genetic material in higher organisms, particularly plants, without
rearrangement caused by insertion or mutagenesis. Therefore, it is an object
herein to provide artificial chromosomes for the introduction of large nucleic
acids into eukaryotic cells and methods using the artificial chromosomes,
particularly for the introduction and expression of nucleic acids in plants.
SUMMARY OF THE INVENTION 

Provided herein are plant artificial chromosomes and methods for
producing plant artificial chromosomes. The artificial chromosomes are fully
functional stable chromosomes. Plant artificial chromosomes provided herein
have a particular composition that makes them ideal vectors for stable,
controlled, high-level expression of heterologous nucleic acids in plant cells.
The artificial chromosomes are capable of independent, extra-genomic
maintenance, replication and segregation within cells and can carry multiple,
large heterologous genes.

Artificial plant chromosomes provided herein are non-natural
chromosomes that exhibit an ordered segmentation that distinguishes them
from naturally occurring chromosomes. The segmented appearance can be
visualized using a variety of chromosome analysis techniques and correlates
with the unique structure of these artificial chromosomes, which, in
particular methods of producing these chromosomes, can arise through
amplification of chromosomal segments (i.e., amplification-based artificial
chromosomes). The artificial chromosomes, throughout the region or regions
of segmentation, are predominantly made up of one or more nucleic acid
units that is (are) repeated in the region (referred to as the repeat region) and
that have a similar gross structure. Repeats of a nucleic acid unit tend to be
of similar size and share some common nucleic acid sequences, for example,
a replication site involved in amplification of chromosome segments and/or
some heterologous nucleic acid. Although the size of a repeating nucleic
acid unit can vary, the units typically are greater than about 100 kb,
greater than about 500 kb, greater than about 1 Mb, greater than about 5
Mb or greater than about 10 Mb. Typically, repeats of a nucleic acid unit are
substantially similar in nucleic acid composition and can be nearly identical.
The common nucleic acid sequences can contain sequences that represent
euchromatic and heterochromatic nucleic acid. The composition of the
amplification-based artificial chromosomes can be such that substantially the
entire chromosome exhibits a segmented appearance or such that only one
or more portions that make up less than the entire chromosome appear
segmented.

The composition of the plant artificial chromosomes provided herein
can vary. For example, in some of the artificial chromosomes provided
herein, the repeat region or regions can be made up predominantly of
heterochromatic DNA (i.e., the repeat region or regions contain more
heterochromatic DNA than other types of DNA, e.g., euchromatic DNA). In
other artificial chromosomes provided herein, the repeat region or regions can
be made up predominantly of euchromatic DNA (i.e., the repeat region or
regions contain more euchromatic DNA than other types of DNA, e.g.,
heterochromatic DNA) or can be made up of substantially equivalent
amounts of heterochromatic and euchromatic DNA, e.g., about 40% to
about 50% of one type of nucleic acid and about 50% to about 60% of the
other type of nucleic acid. The repeat region or regions thus can be entirely
heterochromatic (while still containing one or more heterologous genes), or
can contain increasing amounts of euchromatic DNA, such that, for example,
the region contains about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%,
90% or greater than 90% euchromatic DNA. Common nucleic acid
sequences within repeated nucleic acid units in a repeat region can contain
DNA that represents euchromatic nucleic acid and DNA that represents
heterochromatic nucleic acid. Because the entire artificial chromosome can
be made up predominantly of a repeat region or regions (e.g., the
composition of the chromosome is such that the repeat region or regions
make up greater than about 50% or greater than about 60% of the
chromosome), it is thus possible for the artificial chromosome to be made up
predominantly of heterochromatin or euchromatin, or to be made up of
substantially equivalent amounts of heterochromatin and euchromatin, e.g.,
about 40% to about 50% of one type of nucleic acid and about 50% to
about 60% of the other type of nucleic acid. Plant artificial chromosomes
provided herein can be isolated or contained within cells or vesicles.
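
The composition categories described above can be illustrated with a
minimal sketch. It assumes, purely for illustration, that a repeat region
contains only heterochromatic and euchromatic DNA and that "about 40%" is
treated as a hard cutoff; the function name and the cutoff handling are
hypothetical and are not part of the methods provided herein.

```python
def classify_composition(hetero_percent):
    """Label a repeat region by its heterochromatic DNA content (0 to 100).

    Assumption: the region holds only heterochromatic and euchromatic DNA,
    so the euchromatic share is simply the remainder.
    """
    if not 0 <= hetero_percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    eu_percent = 100 - hetero_percent
    minor = min(hetero_percent, eu_percent)
    # Assumption: "about 40% to about 50%" of the less abundant type is
    # read as minor >= 40, and this check takes precedence.
    if minor >= 40:
        return "substantially equivalent"
    if hetero_percent > eu_percent:
        return "predominantly heterochromatic"
    return "predominantly euchromatic"


if __name__ == "__main__":
    for pct in (100, 90, 55, 45, 10):
        print(pct, classify_composition(pct))
```

Checking the "substantially equivalent" range first is a design choice of
this sketch: a 45%/55% region also contains more of one type than the other,
so the order of the tests decides which label it receives.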

Also provided herein are cells containing plant artificial chromosomes
as described herein, including plant cells and animal cells. Included among
the cells containing the plant artificial chromosomes are any cells that include
one or more plant chromosomes. Included, for example, are plant cells,
including plant protoplasts, in culture and within plant tissues, organs, seeds,
pollen or whole plants. Plant cells containing the plant artificial
chromosomes can be from any type of plant, including monocots and dicots.
For example, the plant cells can be from Arabidopsis, Nicotiana, Solanum,
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum, Helianthus,
Oryza, Glycine (soybean) and Gossypium (cotton). Also contemplated are
mammalian and other animal cells that contain plant artificial chromosomes.

Plant cells containing artificial chromosomes of any species are also
provided herein. Thus, for example, such plant cells can contain an artificial
chromosome containing an animal, e.g., mammalian, centromere or an insect
or avian centromere. Included among the artificial chromosomes contained
within plant cells as provided herein are predominantly heterochromatic
artificial chromosomes [formerly referred to as satellite artificial
chromosomes (SATACs); see, e.g., U.S. Patent Nos. 6,077,697 and
6,025,155 and published International PCT application No. WO 97/40183],
minichromosomes which contain a de novo centromere, artificial
chromosomes containing one or more regions of repeating nucleic acid units
wherein the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid, and in vitro assembled
artificial chromosomes, each from any species. An
exemplary artificial chromosome is a mammalian satellite artificial
chromosome containing a mouse centromere. Included among the plant cells
containing artificial chromosomes of any species are plant cells, including
plant protoplasts, in culture and within plant tissues, organs, seeds, pollen or
whole plants. Plant cells containing the artificial chromosomes can be from
any type of plant, including monocots and dicots. For example, the plant
cells can be from Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus,
Hordeum, Zea mays, Brassica, Triticum, Helianthus and Oryza.

Further provided herein are methods of producing plant artificial
chromosomes. One embodiment of these methods includes the steps of
introducing nucleic acid into a cell containing plant chromosomes and
selecting a cell containing an artificial chromosome that contains one or more
repeat regions in which one or more nucleic acid units is (are) repeated. The
repeats of a nucleic acid unit in a repeat region can contain common nucleic
acid sequences and can be substantially identical. In some embodiments of
this method, the repeat region(s) of the artificial chromosome contain
substantially equivalent amounts of euchromatic and heterochromatic nucleic
acid. The artificial chromosome can be predominantly made up of one or
more repeat regions. In further embodiments of this method, the artificial
chromosome is made up of substantially equivalent amounts of euchromatic
and heterochromatic nucleic acid. In further embodiments of this method,
the repeats of a nucleic acid unit have common nucleic acid sequences
which contain sequences that represent euchromatic and heterochromatic
nucleic acid.

Any cell containing plant chromosomes can be used in these
embodiments of methods of producing plant artificial chromosomes described
herein. For example, the cell can be any cell that contains chromosomes
from Arabidopsis, tobacco, Solanum, Lycopersicon, Daucus, Hordeum, Zea
mays, Brassica, Triticum, Oryza, Capsicum, lentil and/or Helianthus, including
cells or protoplasts of Arabidopsis, tobacco and/or Helianthus.

The nucleic acid that is introduced into a cell containing plant
chromosomes in methods of producing a plant artificial chromosome as
provided herein can be any nucleic acid, including, but not limited to, satellite
DNA, rDNA and lambda phage DNA. Satellite DNA and rDNA include such
DNA from plants, such as, for example, Arabidopsis, Nicotiana, Solanum,
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza,
and from animals, such as mammals. The rDNA can contain sequences of
an intergenic spacer region, such as can be obtained, for example, from DNA
of Arabidopsis, Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat,
radish and mung bean. In some embodiments of the method, the nucleic
acid contains a nucleic acid sequence that facilitates amplification of a region
of a plant chromosome or targets it to an amplifiable region of a plant
chromosome.

In further embodiments of methods of producing plant artificial
chromosomes provided herein, the nucleic acid that is introduced into a cell
containing one or more plant chromosomes includes nucleic acid that provides
for identification of cells containing the nucleic acid. Such nucleic acids include
nucleic acid encoding a fluorescent protein, such as a green, blue or red
fluorescent protein, and nucleic acid encoding a selectable marker, such as,
for example, proteins that confer resistance to phosphinothricin, ammonium
glufosinate, glyphosate, kanamycin, hygromycin, dihydrofolate or
sulfonylurea.

In embodiments of methods of producing plant artificial chromosomes
in which nucleic acid is introduced into a cell containing one or more plant
chromosomes, the cell can be cultured through two or more cell doublings,
and typically from about 5 to about 60, or about 5 to about 55, or about 10
to about 55, or about 25 to about 55, or about 35 to about 55 cell doublings
following introduction of nucleic acid into a cell. The step of selecting a cell
containing a plant artificial chromosome can include sorting of cells into
which nucleic acid was introduced. For example, cells can be sorted on the
basis of the presence of a selectable marker, such as a reporter protein, or
by growing (culturing) the cells under selective conditions. The selection
step can include fluorescent in situ hybridization (FISH) analysis of cells into
which nucleic acid is introduced.

Also provided are methods of producing a transgenic plant using
artificial chromosomes that function in plants, and transgenic plants
containing artificial chromosomes. Artificial chromosomes used in the
methods of producing transgenic plants can be of any species. For example,
the artificial chromosomes can contain a centromere from species such as
animals (e.g., mammals, birds or insects) or plants, that functions to segregate
nucleic acids to daughter cells through cell division. In some embodiments
of the methods for producing a transgenic plant, the artificial chromosomes
contain repeat regions predominantly made up of repeats of one or more
nucleic acid units. Repeats of a nucleic acid unit can share some common
nucleic acid sequences, for example, a replication site involved in
amplification of chromosome segments and/or some heterologous nucleic
acid. Repeats of a nucleic acid unit can be substantially identical. Common
nucleic acid sequences of repeats of a nucleic acid unit can contain
sequences that represent euchromatic and heterochromatic nucleic acid.

Repeat regions of artificial chromosomes that can be used in the
methods of producing a transgenic plant can be made up of substantially
equivalent amounts of heterochromatic and euchromatic DNA or can be
made up predominantly of heterochromatic DNA or can be made up
predominantly of euchromatic DNA. The artificial chromosome can be made
up predominantly of heterochromatic or euchromatic DNA or can be made up
of substantially equivalent amounts of heterochromatin and euchromatin.
Such artificial chromosomes that contain plant centromeres can contain a
plant centromere from any species of plant, including monocots and dicots.
For example, the centromere can be from Arabidopsis, tobacco, Helianthus,
Solanum, Lycopersicon, Daucus, Hordeum, Zea, Brassica, Triticum, rye,
wheat, radish, mung bean or Oryza. The artificial chromosomes can be
made using methods described herein.

In a method of producing a transgenic plant provided herein, an
artificial chromosome, such as those described above and elsewhere herein,
is introduced into a plant cell. The artificial chromosome can contain
heterologous nucleic acid encoding a gene product such as, for example, an
enzyme, antisense RNA, tRNA, rDNA, a structural protein, a marker or
reporter protein, a ligand, a receptor, a ribozyme, a therapeutic protein, a
biopharmaceutical protein, a vaccine, a blood factor, an antigen, a hormone,
a cytokine, a growth factor or an antibody. The product can be one that
provides for resistance to diseases, insects, herbicides or stress in the plant.
The product can be one that provides for an agronomically important trait in
the plant and/or that alters the nutrient utilization and/or improves the
nutrient quality of the plant. Heterologous nucleic acid of an artificial
chromosome can be contained within a bacterial artificial chromosome (BAC)
or a yeast artificial chromosome (YAC).

The plant cell into which such artificial chromosomes can be
introduced in methods of producing a transgenic plant provided herein can be
any species of plant cell, including, but not limited to, Arabidopsis, tobacco,
Helianthus, Solanum, Lycopersicon, Daucus, Hordeum, Zea, Brassica,
Triticum, rye, wheat, radish, mung bean, Capsicum, lentil and Oryza. Any
cell that can develop into a plant can be used, including plant cells and
protoplasts of plant embryos, calli, tissues, meristem, organs, seeds,
seedlings, pollen, pollen tubes or whole plants.

Artificial chromosomes can be introduced into plant cells in the
methods of producing a transgenic plant using any process for transfer of
nucleic acids into plant cells, including, but not limited to, chemical, physical
and electrical processes and combinations thereof. For example, the artificial
chromosomes can be transferred into plant cells via direct contact in the
absence or presence of a fusogen, e.g., polyethylene glycol (PEG), calcium
phosphate and/or lipid, or they can be encapsulated in a lipid structure (e.g., a
liposome) or contained within a protoplast or microcell which is then allowed
to fuse (in the presence or absence of a fusogen such as PEG) with a plant
cell for introduction of the artificial chromosome into the cell in a method of
producing a transgenic plant. Artificial chromosomes can be transferred to
plant cells that are subjected to electrical pulses (e.g., electroporation) and/or
ultrasound (e.g., sonoporation) before, during and/or after exposure of the
cells to the artificial chromosomes. Use of electrical pulses and/or ultrasound
can be in combination with any other agents, e.g., PEG and/or lipids, used in
transferring nucleic acids into plant cells. Artificial chromosomes can also be
physically injected into plant cells through a micropipette or needle or
introduced into plant cells through bombardment of the cells with
microprojectiles coated with the chromosomes. To facilitate transfer of
nucleic acids into plant cells, the recipient cells or tissue can be subjected to
mechanical wounding.

Plant cells into which artificial chromosomes have been introduced for
purposes of producing a transgenic plant are cultured under conditions that
permit generation of a whole plant therefrom. The transformed cells can be
analyzed prior to use in the generation of whole plants to determine
suitability. For example, the cells can be analyzed for the presence of
artificial chromosomes and/or regenerative capacity. Plant regeneration
techniques, many of which are known to those of skill in the art, can be
used to generate whole plants from, for example, cells, embryos and calli
containing artificial chromosomes. For example, plants can be regenerated
from cells containing artificial chromosomes by the planting of transformed
roots, plantlets, seed, seedlings, and any structure capable of growing into a
whole plant.

Further provided herein are methods for producing an acrocentric plant
chromosome and methods for producing plant chromosomes containing
adjacent regions of rDNA and heterochromatin, in particular, pericentric
and/or satellite heterochromatin. Also provided herein are methods for
generating acrocentric plant chromosomes containing adjacent regions of
heterochromatin, such as pericentric heterochromatin and/or satellite DNA,
and rDNA on the short arm of the chromosome.

One embodiment of these methods includes steps of introducing
nucleic acid containing two site-specific recombination sites into a cell
containing one or more plant chromosomes, recombining nucleic acids of the
two site-specific recombination sites, and selecting a cell containing an
acrocentric plant chromosome and/or a plant chromosome containing
adjacent regions of rDNA and heterochromatin. The two site-specific
recombination sites can be contained on separate nucleic acid fragments
which are introduced into the cell simultaneously or sequentially.

Other embodiments of the methods of producing an acrocentric plant
chromosome and/or a plant chromosome that contains adjacent regions of
rDNA and heterochromatin include steps of introducing a first nucleic acid
containing a site-specific recombination site into a first plant chromosome,
introducing a second nucleic acid containing a site-specific recombination
site into a second plant chromosome, recombining nucleic acids of the first
and second chromosomes and selecting a plant chromosome that is
acrocentric or that contains adjacent regions of rDNA and heterochromatin.
For example, to produce an acrocentric plant chromosome, the first nucleic
acid can be introduced into or adjacent to the pericentric heterochromatin of
the first chromosome and/or the second nucleic acid can be introduced into
the distal end of the arm of the second chromosome. To produce an
acrocentric plant chromosome containing adjacent regions of rDNA and
heterochromatin, for example, the first nucleic acid can be introduced into or
adjacent to the pericentric heterochromatin on the short arm of an acrocentric
plant chromosome and the second nucleic acid can be introduced into or
adjacent to rDNA. To produce a plant chromosome containing adjacent
regions of rDNA and heterochromatin, for example, the first nucleic acid can
be introduced into or adjacent to heterochromatin, such as pericentric
heterochromatin or satellite DNA, and the second nucleic acid can be
introduced into or adjacent to rDNA. When the chromosomes are located
within a cell, the method can include selecting a cell containing a plant
chromosome that is acrocentric and/or that contains adjacent regions of
rDNA and heterochromatin.

Another embodiment of the methods of producing an acrocentric plant
chromosome includes steps of introducing a first nucleic acid containing a
site-specific recombination site into the pericentric heterochromatin of a plant
chromosome, introducing a second nucleic acid containing a site-specific
recombination site into the distal end of the chromosome in which the first
and second recombination sites are located on the same arm of the
chromosome, recombining nucleic acids of the first and second
recombination sites in the chromosome and selecting a plant chromosome
that is acrocentric.
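
The effect of recombining two sites on the same arm can be pictured with a
toy model. The sketch below assumes, for illustration only, that the two
sites are in the same (direct) orientation, so that recombination removes
the intervening segment and leaves a single site behind, as is generally
understood for site-specific recombinase systems. The segment names, sizes
(arbitrary units) and function names are hypothetical.

```python
SITE = "SITE"  # hypothetical marker for a site-specific recombination site


def recombine_same_arm(segments):
    """Remove everything between the first two SITE markers, keeping one SITE.

    Assumption: both sites are directly oriented on one chromosome arm.
    """
    positions = [i for i, name in enumerate(segments) if name == SITE]
    if len(positions) < 2:
        raise ValueError("two recombination sites are required")
    first, second = positions[0], positions[1]
    return segments[: first + 1] + segments[second + 1 :]


def arm_lengths(segments, sizes):
    """Return (left arm, right arm) lengths on either side of the centromere."""
    cen = segments.index("CEN")
    left = sum(sizes.get(name, 0) for name in segments[:cen])
    right = sum(sizes.get(name, 0) for name in segments[cen + 1 :])
    return left, right


# Illustrative chromosome: one site at the distal end of the left arm and
# one next to the pericentric heterochromatin of the same arm.
chromosome = ["tel_L", SITE, "arm_L", SITE, "pericentric_L",
              "CEN", "pericentric_R", "arm_R", "tel_R"]
sizes = {"tel_L": 1, "arm_L": 40, "pericentric_L": 5,
         "pericentric_R": 5, "arm_R": 50, "tel_R": 1}

if __name__ == "__main__":
    shortened = recombine_same_arm(chromosome)
    print(arm_lengths(chromosome, sizes), arm_lengths(shortened, sizes))
```

In this toy example one arm is reduced to little more than its pericentric
region while the other arm is unchanged, which is the arm-length asymmetry
that characterizes an acrocentric chromosome.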

Another method of producing an acrocentric plant chromosome or a
plant chromosome containing adjacent regions of rDNA and heterochromatin
includes steps of introducing nucleic acid containing a recombination site
adjacent to or sufficiently near nucleic acid encoding a selectable marker into
a first plant cell for recombination and introduction of the marker into the
chromosome, generating a first transgenic plant from the first plant cell,
introducing nucleic acid containing a promoter functional in a plant cell and a
recombination site in operative linkage into a second plant cell, generating a
second transgenic plant from the second plant cell, crossing the first and
second plants, obtaining plants resistant to an agent that selects for cells
containing the nucleic acid encoding the selectable marker, and selecting a
resistant plant that contains cells containing an acrocentric plant
chromosome or a plant chromosome containing adjacent regions of rDNA
and heterochromatin. Methods of this embodiment can optionally include
steps of selecting first and second transgenic plants such that one of the
plants contains a chromosome containing a recombination site in a region
within or adjacent to the pericentric heterochromatin and the other plant
contains a chromosome containing a recombination site located within or
adjacent to rDNA of the chromosome. These methods can further include
the steps of selecting first and second transgenic plants where one of the
plants contains a chromosome containing a recombination site located on a
short arm of the chromosome in a region adjacent to the pericentric
heterochromatin and the other plant contains a chromosome containing a
recombination site located in rDNA of the chromosome. In one embodiment,
the recombination sites on the two chromosomes are in the same orientation.
In methods of producing an acrocentric plant chromosome, one or
both of these recombination sites is located on a short arm of the
chromosome. For example, one of the plants contains a
chromosome containing a recombination site in a region within or adjacent to
the pericentric heterochromatin located on the short arm of the chromosome.
The selecting steps can further include selecting first and second transgenic
plants such that the recombination sites on the two chromosomes are in the
same orientation.

In any of these methods of producing an acrocentric plant
chromosome or a plant chromosome containing adjacent regions of rDNA
and heterochromatin (in particular, pericentric heterochromatin and/or
satellite DNA), recombination between the first and second site-specific
recombination sites can be provided for in a number of ways. For example, a
recombinase activity that catalyzes the recombination reaction can be
introduced into a cell containing one or more chromosomes containing the
sites. The recombinase activity can be encoded by nucleic acid that is
introduced into the cell simultaneously with nucleic acid containing a
site-specific recombination site or that is introduced into the cell at a
different time. Recombinase activity occurs within the cell upon expression
of the nucleic acid encoding a recombinase activity, which can be operatively
linked to a promoter functional in the cell. The recombinase activity can be
constitutively expressed or can be induced, for example, by linking the
nucleic acid encoding the recombinase to an inducible promoter. It is also
possible that a cell into which nucleic acid containing site-specific
recombination sites is introduced contains a recombinase enzyme, which can
be constitutively or inducibly expressed. Alternatively, a transgenic plant can
be generated from cells containing the recombination sites and crossed with
a transgenic plant containing nucleic acid encoding a recombinase.

Any site-specific recombinase system known to those of skill in the
art is contemplated for use herein. It is contemplated that one or a plurality
of sites that direct the recombination by the recombinase are introduced into
the ACes (or other ACs) and then heterologous genes linked to the cognate
site are introduced into an ACes to produce platform ACes. The resulting
ACes are introduced into cells with nucleic acid encoding the cognate
recombinase, typically on a vector, and nucleic acid encoding heterologous
nucleic acid of interest linked to the appropriate recombination site for
insertion into the ACes chromosome. The recombinase-encoding nucleic
acid may be introduced into the AC, including an ACes, or on the same or a
different vector from the heterologous nucleic acid.

For the methods herein any recombinase enzyme that catalyzes
site-specific recombination can be used to facilitate recombination between
the first and second site-specific recombination sites. A variety of
recombinases and attachment/recombination sites therefor are available
and/or known to those of skill in the art. These include, but are not limited
to: the Cre/lox recombination system using CRE recombinase from the
Escherichia coli phage P1, the FLP/FRT system of yeast using the FLP
recombinase from the 2μ episome of Saccharomyces cerevisiae, the
resolvases, including Gin recombinase of phage Mu, Cin, Hin, αδ Tn3; the
Pin recombinase of E. coli, the R/RS system of the pSR1 plasmid of
Zygosaccharomyces rouxii, site-specific recombinases from Kluyveromyces
drosophilarium and Kluyveromyces waltii, and other systems.

Also contemplated is the E. coli phage lambda integrase system, the phage
lambda integrase and the cognate att sites (see, also copending application
U.S. application Serial No. (attorney docket No. 24601-420, filed on the
same day herewith)).

In any of these methods of producing acrocentric plant chromosomes,
nucleic acid containing a site-specific recombination site can also contain
nucleic acid encoding a selectable marker. The nucleic acids used in the
methods can be designed such that expression of the selectable marker
occurs only upon the desired recombination event.
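
One way to picture a marker that is expressed only after the desired recombination event is a promoterless marker that acquires a promoter when it integrates at a recombination site. The following is a minimal sketch under that assumption. Constructs are modeled as ordered lists of element names, and the names (`platform_promoter`, `marker`, `site`, and so on) are hypothetical placeholders rather than elements defined in this document.

```python
def integrate(platform, donor, site="site"):
    """Insert a circular donor into a linear platform at a shared site.

    The donor is opened at its own site and placed directly downstream of
    the platform site; a second copy of the site follows the insert.
    """
    i = platform.index(site)
    j = donor.index(site)
    insert = donor[j + 1:] + donor[:j]  # donor linearized at its site
    return platform[: i + 1] + insert + [site] + platform[i + 1:]


def marker_expressed(construct, promoter, marker="marker", site="site"):
    """Return True if the named promoter is the nearest upstream element
    of the marker, ignoring intervening recombination sites."""
    k = construct.index(marker)
    upstream = [element for element in construct[:k] if element != site]
    return bool(upstream) and upstream[-1] == promoter


# Hypothetical constructs: the donor marker has no promoter of its own.
platform = ["platform_promoter", "site", "rest_of_platform"]
donor = ["site", "marker", "plant_promoter", "gene"]

recombinant = integrate(platform, donor)
```

In this model the marker is silent in the donor alone and becomes expressed only in the recombinant, so selection for the marker selects for the intended integration.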
Acrocentric plant chromosomes produced by the methods provided
herein can be of any composition. For example, the DNA of the short arm of
the acrocentric chromosome can contain less than 5% or less than 1%
euchromatic DNA or can contain no euchromatic DNA. Acrocentric plant
artificial chromosomes in which the short arm of the acrocentric chromosome
does not contain euchromatic DNA are provided.

In another embodiment, a method of producing a plant artificial
chromosome is provided that includes the steps of introducing nucleic acid
into a plant cell acrocentric chromosome in which the short arm does not
contain euchromatic DNA; culturing the cell through at least one cell
division; and selecting a cell containing an artificial chromosome, such as
one that is predominantly heterochromatic. The acrocentric chromosome is
produced by any of the methods described herein or other suitable methods.

In another embodiment, a method for producing an artificial
chromosome is provided that includes the steps of introducing nucleic acid
into a plant cell; and selecting a plant cell that includes an artificial
chromosome that contains one or more repeat regions. In this AC, one or
more nucleic acid units is (are) repeated in a repeat region; repeats of a
nucleic acid unit have common nucleic acid sequences; and the common
sequences of nucleotides include sequences that represent euchromatic and
heterochromatic nucleic acid. The nucleic acid can include plant rDNA from
a dicot plant species or plant rDNA from a monocot plant species. The
intergenic spacer region can be from DNA from a Nicotiana plant or other
suitable source of such DNA. The rDNA can be plant rDNA, and the plant
can be a dicot or a monocot.

Also provided are isolated plant artificial chromosomes that contain
one or more repeat regions. In these ACs one or more nucleic acid units is
(are) repeated in a repeat region; repeats of a nucleic acid unit have common
nucleic acid sequences; and the common sequences of nucleotides include
sequences that represent euchromatic and heterochromatic nucleic acid. The
artificial chromosome can be produced by a method that includes the steps
of: introducing nucleic acid into a plant cell; and selecting a plant cell
containing an artificial chromosome that contains one or more repeat regions.
The repeats of a nucleic acid unit have common nucleic acid sequences; and
the common nucleic acid sequences contain sequences that represent
euchromatic and heterochromatic nucleic acid.

In another embodiment, another method for producing an acrocentric
plant chromosome is provided. The method includes the steps of:
introducing nucleic acid containing two site-specific recombination sites into
a cell containing one or more plant chromosomes; and introducing into the
cell a recombinase activity that catalyzes recombination between the two
recombination sites to produce a plant acrocentric chromosome. In this
embodiment, the two site-specific recombination sites can be on separate
nucleic acid fragments, which optionally can be introduced into the cell
simultaneously or sequentially. The resulting artificial chromosome can be
one that is predominantly heterochromatic.

In another embodiment, a method of producing a plant artificial
chromosome is provided. The method includes the steps of: introducing
nucleic acid into a plant chromosome, such as but not limited to, an
acrocentric chromosome, in a cell that contains adjacent regions of rDNA and
heterochromatic DNA; culturing the cell through at least one cell division;
and selecting a cell containing an artificial chromosome. The resulting
artificial chromosome can be predominantly heterochromatic. The
acrocentric chromosome can be one where the short arm of the chromosome
contains adjacent regions of rDNA and heterochromatic DNA, such as, but
not limited to, pericentric heterochromatin.

Also provided are a variety of vectors. Among these are vectors
containing nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal cells;
and wherein the agent is not toxic to plant cells; a recognition site for
recombination; and a sequence of nucleotides that facilitates amplification of
a region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome. Exemplary of such vectors are pAgIIa and pAgIIb.

Another vector provided herein contains nucleic acid encoding a
selectable marker that is not operably associated with any promoter, wherein
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells; and wherein the agent is not toxic to
plant cells; a recognition site for recombination; and nucleic acid encoding a
protein operably linked to a plant promoter. Exemplary of these vectors are
pAg1 and pAg2.

Another vector that is provided contains: nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of plant cells in the presence of an
agent normally toxic to the plant cells but not toxic to animal cells; a
recognition site for recombination; and nucleic acid encoding a protein
operably linked to a plant promoter.

Another vector is a plant transformation vector that contains nucleic
acid encoding a recognition site for recombination; a sequence of nucleotides
that facilitates or causes amplification of a region of a plant chromosome;
one or more selectable markers that are expressed in plant cells to permit the
selection of cells containing the vector; and Agrobacterium nucleic acid. The
vector is for Agrobacterium-mediated transformation of plants.

Another vector that is provided contains a recognition site for
recombination; and a sequence of nucleotides that facilitates amplification of
a region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome, wherein the plant is selected from the group
consisting of Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus,
Hordeum, Zea mays, Brassica, Triticum, Helianthus, soybean, cotton and
Oryza.

In these vectors, the amplifiable region can contain heterochromatic
nucleic acid; the amplifiable region can contain rDNA. Exemplary sequences
of nucleotides that facilitate amplification of a region of a plant chromosome
or target the vector to an amplifiable region of a plant chromosome are any
that contain a sufficient portion of an intergenic spacer region of rDNA to
facilitate amplification or effect the targeting. Such sufficient portion can be
at least 14, 20, 30, 50, 100, 150, 300, 500, 1 kB, 2 kB, 3 kB, 5 kB, 10 kB
or more contiguous nucleotides from an intergenic spacer region and/or other
rDNA region. An exemplary selectable marker encodes a product that confers
resistance to zeomycin. The protein in the vectors includes a protein that is a
selectable marker that permits growth of plant cells in the presence of an
agent normally toxic to the plant cells, such as, for example, resistance to
hygromycin or to phosphinothricin. Other such protein markers include, but
are not limited to, fluorescent proteins, such as, for example, green, blue
and red fluorescent proteins. An exemplary recognition site contains an att
site. Exemplary promoters for inclusion in the vectors include, but are not
limited to, nopaline synthase (NOS) or CaMV35S.

Cells containing any of the vectors or mixtures thereof are provided.
The cells include any cells that have at least one plant chromosome, such as
a plant cell. The cells can be protoplasts.

Methods using these vectors are provided. The methods include a
step of introducing one of the vectors into a cell, such as a cell that
contains at least one plant chromosome. Such vector is, for example, a
vector that contains nucleic acid encoding a selectable marker that is not
operably associated with any promoter, where the selectable marker permits
growth of animal cells in the presence of an agent normally toxic to the
animal cells but is not toxic to plant cells; a recognition site for
recombination; and



WO 02/096923 



PCT/US02/17451




nucleic acid encoding a protein operably linked to a plant promoter. In this
method, the cell contains an animal, such as a mammalian, platform ACes that
contains a recognition site, such as, for example, an att site, that recombines
with the recognition site in the vector in the presence of the recombinase
therefor, thereby incorporating the selectable marker that is not operably
associated with any promoter and the nucleic acid encoding a protein
operably linked to a plant promoter into the platform ACes to produce a
resulting platform ACes. The platform ACes can contain a promoter that,
upon recombination, is operably linked to the selectable marker that in the
vector is not operably associated with a promoter. The method can further
include transferring the resulting platform ACes into a plant cell to produce a
plant cell that contains the platform ACes. The method optionally further
includes culturing the plant cell that contains the platform ACes under
conditions whereby the protein encoded by the nucleic acid that is operably
linked to a plant promoter is expressed.

The resulting platform ACes optionally is isolated prior to transfer.
The ACes can be introduced into a plant cell by any suitable method, such as
one selected from among protoplast transfection, lipid-mediated delivery,
liposomes, electroporation, sonoporation, microinjection, particle
bombardment, silicon carbide whisker-mediated transformation, polyethylene
glycol (PEG)-mediated DNA uptake, lipofection and lipid-mediated carrier
systems. The resulting platform ACes can be transferred by fusion of the
cells, which, for example, are plant protoplasts. In another embodiment, the
cell can be an animal cell, such as a mammalian, including human, cell.

In another method, a vector is introduced into plant cells. Such
vector, for example, can be a vector that includes nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells but is not toxic to plant cells; a
recognition site for recombination; and a sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome. The plant cells are
cultured and a plant cell(s) containing an artificial chromosome that contains
one or more repeat regions is selected. In this method, a sufficient portion of
the vector can integrate into a chromosome in the plant cell to result in
amplification of chromosomal DNA. The resulting selected artificial
chromosome can be one in which one or more nucleic acid units is (are)
repeated in a repeat region; repeats of a nucleic acid unit have common
nucleic acid sequences; and the repeat region(s) contain substantially
equivalent amounts of euchromatic and heterochromatic nucleic acid. The
resulting artificial chromosome produced in the method optionally can be
isolated.

Another method is also provided. This method includes the steps of
introducing a vector into a cell, and culturing the resulting cell under
conditions whereby the protein encoded by nucleic acid operably linked to
an animal promoter is expressed. In the method the vector can contain:
nucleic acid encoding a selectable marker that is not operably associated
with any promoter, where the selectable marker permits growth of animal
cells in the presence of an agent normally toxic to the animal cells but is not
toxic to plant cells; a recognition site for recombination; and nucleic acid
encoding a protein operably linked to an animal promoter. The cell can
contain a platform plant artificial chromosome (PAC) that contains a
recombination site and an animal promoter that upon recombination is
operably linked to the selectable marker that in the vector is not operably
associated with a promoter. Introduction can be effected under conditions
whereby the vector recombines with the PAC to produce a plant platform
PAC that contains the selectable marker operably linked to the promoter. In
this method, the artificial chromosome can be an ACes. In addition, the
plant platform PAC can be an ACes.
The vectors, such as those that contain nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells but is not toxic to plant cells; a
recognition site for recombination; and a sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome, and the plant
transformation vectors that contain nucleic acid for Agrobacterium-mediated
transformation of plants, can be used to produce artificial chromosomes. In
one exemplary method, such vector is introduced into a cell containing one
or more plant chromosomes; and a cell containing an artificial chromosome
that contains one or more repeat regions is selected. The artificial
chromosome contains one or more nucleic acid units that is (are) repeated in
a repeat region; the repeats of a nucleic acid unit have common nucleic acid
sequences; and the common nucleic acid sequences contain sequences that
represent euchromatic and heterochromatic nucleic acid. In another method,
a cell containing an artificial chromosome that contains one or more repeat
regions is selected. The artificial chromosome contains one or more nucleic
acid units that is (are) repeated in a repeat region; repeats of a nucleic acid
unit have common nucleic acid sequences; and the repeat region(s) contain
substantially equivalent amounts of euchromatic and heterochromatic nucleic
acid.

DESCRIPTION OF THE DRAWINGS

Figure 1 provides a map of plasmid pAg1.

Figure 2 provides a schematic representation of the construction of
plasmid pAg1.

Figure 3 provides a map of plasmid pAg2.

Figure 4 provides a schematic representation of the construction of
plasmid pAg2.

Figure 5 provides a schematic representation of the construction of
plasmids pAgIIa and pAgIIb.

Figures 6A-6B provide restriction maps of the DNA inserted into pAg1
to form plasmids pAgIIa and pAgIIb.

Figure 7 provides a map of plasmid pSV40193attPsensePUR.

Figure 8 depicts a method for formation of a chromosome platform
with multiple recombination integration sites, such as attP sites.

Figure 9 diagrammatically summarizes the platform technology;
marker 1 permits selection of the artificial chromosomes containing the
integration site; marker 2, which is promoterless in the donor vector, permits
selection of recombinants. Upon recombination with the platform, marker 2
is expressed under the control of a promoter resident on the platform.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Definitions 

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as is commonly understood by one of skill in the art
to which this invention belongs. All patents, patent applications, published
applications and other publications and published nucleotide and amino acid
sequences (e.g., sequences available in GenBank or other databases) referred
to herein are incorporated by reference in their entirety. Where reference is
made to a URL or other such identifier or address, it is understood that such
identifiers can change and particular information on the internet can come
and go, but equivalent information can be found by searching the internet.
Reference thereto evidences the availability and public dissemination of such
information.

As used herein, a chromosome is a defined composition of nucleic
acid that is capable of replication and segregation within a cell upon cell
division. Typically, a chromosome may contain a centromeric region,
telomeric regions and a region of nucleic acid between the centromeric and
telomeric regions.
As used herein, a centromere is a molecular composition that includes
a nucleic acid sequence that confers an ability to segregate to daughter cells
through cell division. A centromere may confer stable segregation of a
nucleic acid sequence, including an artificial chromosome containing the
centromere, through mitotic and/or meiotic divisions. A plant centromere is
not necessarily derived from plants, but has the ability to promote DNA
segregation in plant cells.

As used herein, euchromatin and heterochromatin have their
recognized meanings. Euchromatin refers to chromatin that stains diffusely
and that typically contains genes, and heterochromatin refers to chromatin
that remains unusually condensed and that has been thought to be
transcriptionally inactive or has low transcriptional activity relative to
euchromatin. Highly repetitive DNA sequences (satellite DNA) are usually
located in regions of the heterochromatin surrounding the centromere
(pericentric or pericentromeric heterochromatin). Constitutive
heterochromatin refers to heterochromatin that contains the highly repetitive
DNA which is constitutively condensed and genetically inactive.

As used herein, an acrocentric chromosome refers to a chromosome 
with arms of unequal length. 

As used herein, endogenous chromosomes refer to genomic
chromosomes as found in the cell prior to generation or introduction of an
artificial chromosome.

As used herein, artificial chromosomes are nucleic acid molecules,
typically DNA, that stably replicate and segregate alongside endogenous
chromosomes in cells and have the capacity to accommodate and express
heterologous genes contained therein. A mammalian artificial chromosome
(MAC) refers to a chromosome that has an active mammalian centromere(s).
Plant artificial chromosomes (PAC), insect artificial chromosomes and avian
artificial chromosomes refer to chromosomes that include centromeres that
function in plant, insect and avian cells, respectively. Human artificial
chromosomes (HAC) refer to chromosomes that include centromeres that
function in human cells. For exemplary artificial chromosomes, see, e.g.,
U.S. Patent Nos. 6,025,155; 6,077,697; 5,288,625; 5,712,134;
5,695,967; 5,869,294; 5,891,691 and 5,721,118 and published
International PCT application Nos. WO 97/40183 and WO 98/08964.

As used herein, amplification, with reference to DNA, is a process in
which segments of DNA are duplicated to yield two or multiple copies of
substantially similar or identical or nearly identical DNA segments that are
typically joined as substantially tandem or successive repeats or inverted
repeats.

As used herein, amplification-based artificial chromosomes are
artificial chromosomes derived from natural or endogenous chromosomes by
virtue of an amplification event, such as one that may be initiated by
introduction of heterologous nucleic acid into heterochromatin, for example,
pericentric heterochromatin, in a chromosome. As a result of such an event,
chromosomes and/or fragments thereof exhibiting segmented or repeating
patterns arise. Artificial chromosomes can be formed from these
chromosomes and fragments. Hence, amplification-based artificial
chromosomes refer to non-natural or isolated chromosomes that exhibit an
ordered segmentation that is not typically observed in naturally occurring
chromosomes and that can be a basis for distinguishing them from naturally
occurring chromosomes. Amplification-based artificial chromosomes can
also be distinguished from naturally occurring chromosomes by virtue of their
typically smaller size and often segmented appearance when visualized. The
segmented appearance, which can be visualized using a variety of
chromosome analysis techniques as described herein and known to those of
skill in the art, correlates with the unique structure of these artificial
chromosomes. In addition to containing one or more centromeres, the
amplification-based artificial chromosomes, throughout the region or regions
of segmentation, are predominantly made up of one or more nucleic acid
units, also referred to as "amplicons", that is (are) repeated in the region and
that have a similar gross structure. Thus, a region of segmentation may be
referred to as a repeat region. Repeats of an amplicon tend to be of similar
size and share some common nucleic acid sequences. For example, each
repeat of an amplicon may contain a replication site involved in amplification
of chromosome segments and/or some heterologous nucleic acid that was
utilized in the initial production of the artificial chromosome. Typically, the
repeating units are substantially similar in nucleic acid composition and may
be nearly identical. The common nucleic acid sequences may contain
sequences that represent euchromatic and heterochromatic nucleic acid.
Amplicon sizes vary but typically tend to be greater than about 100 kb,
greater than about 500 kb, greater than about 1 Mb, greater than about 5
Mb or greater than about 10 Mb. The composition of the amplification-based
artificial chromosomes may be such that substantially the entire chromosome
exhibits a segmented appearance or such that only one or more portions that
make up less than the entire chromosome appear segmented. The
amplification-based artificial chromosomes can also differ depending on the
chromosomal region that has undergone amplification in the process of
artificial chromosome formation. The structures of the resulting
chromosomes can vary depending upon the initiating event and/or the
conditions under which the heterologous nucleic acid is introduced, including
modification to the endogenous chromosomes. For example, in some of the
artificial chromosomes provided herein, the region or regions of segmentation
may be made up predominantly of heterochromatic DNA. In other artificial
chromosomes provided herein, the region or regions of segmentation may be
made up predominantly of euchromatic DNA or may be made up of similar
amounts of heterochromatic and euchromatic DNA. The region or regions of
segmentation thus may be entirely heterochromatic (while still containing one
or more heterologous nucleic acid sequences), or may contain increasing
amounts of euchromatic DNA, such that, for example, the region contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
90% euchromatic DNA. Because the entire artificial chromosome can be
made up predominantly of a region or regions of segmentation, it is thus
possible for the artificial chromosome to be made up predominantly of
heterochromatin or euchromatin, or to be made up of substantially equivalent
amounts of heterochromatin and euchromatin, e.g., about 40% to about
50% of one type of nucleic acid and about 50% to about 60% of the other
type of nucleic acid.

As used herein the term "predominantly" with respect to a
composition generally refers to a state of the composition in which it can be
characterized as being or having more of the predominant feature than other
features which are not predominant. The predominant feature may represent
more than about 50%, more than about 60%, more than about 70%, more
than about 80%, more than about 90%, more than about 95% or essentially
100% of the composition. Thus, for example, a repeat region that is
predominantly made up of heterochromatic DNA contains more
heterochromatic DNA than other types, e.g., euchromatic, of DNA. The
repeat region may be more than about 50%, more than about 60%, more
than about 70%, more than about 80%, more than about 90% or more than
about 95% heterochromatic DNA or may be essentially 100%
heterochromatic DNA. An artificial chromosome predominantly made up of
heterochromatin contains more heterochromatic DNA than other types, e.g.,
euchromatic, of DNA and may be more than about 50%, more than about
60%, more than about 70%, more than about 80%, more than about 90%
or more than about 95% heterochromatic DNA or may be essentially 100%
heterochromatic DNA.
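
The definition of "predominantly" amounts to a fraction exceeding a chosen cutoff. The sketch below is one hypothetical way to express it. The function name, the two-category simplification (heterochromatic versus euchromatic DNA only), and the treatment of "about 50%" as a strict numeric threshold are illustrative assumptions and not part of the definition itself.

```python
def predominant_type(het_mb, eu_mb, threshold=0.5):
    """Return which DNA type predominates in a region, or None.

    het_mb, eu_mb: amounts of heterochromatic and euchromatic DNA (Mb).
    threshold: fraction a type must exceed to count as predominant
               (0.5 corresponds to "more than about 50%").
    """
    total = het_mb + eu_mb
    if total <= 0:
        raise ValueError("region must contain some DNA")
    het_fraction = het_mb / total
    if het_fraction > threshold:
        return "heterochromatic"
    if 1 - het_fraction > threshold:
        return "euchromatic"
    return None  # neither type exceeds the threshold
```

For example, a region with 9 Mb of heterochromatic and 1 Mb of euchromatic DNA is predominantly heterochromatic at any of the listed cutoffs up to 80%, whereas an evenly split region is predominantly neither.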

As used herein an amplicon is a repeated nucleic acid unit. In some of
the artificial chromosomes described herein, an amplicon may contain a set
of inverted repeats of a megareplicon. A megareplicon represents a higher
order replication unit. For example, with reference to some of the
predominantly heterochromatic artificial chromosomes, particularly eukaryotic
chromosomes, described herein, the megareplicon may contain a set of
tandem DNA blocks (e.g., ~7.5 Mb DNA blocks) each containing satellite
DNA flanked by non-satellite DNA or may substantially be made up of rDNA.
Contained within the megareplicon is a primary replication site, referred to as
the megareplicator, which may be involved in organizing and facilitating
replication of segments of chromosomes, including, for example,
heterochromatin, pericentric heterochromatin, rDNA and/or possibly the
centromeres. Within the megareplicon there may be smaller (e.g., 50-300
kb) secondary replicons.

As used herein, amplifiable, when used in
reference to a chromosome, particularly the method of generating artificial
chromosomes provided herein, refers to a region of a chromosome that is
prone to amplification. Amplification typically occurs during replication and
other cellular events involving recombination (e.g., DNA repair). Included
among such regions are regions of the chromosome that contain tandem
repeats, such as satellite DNA, rDNA, and other such sequences.

Among the artificial chromosome systems provided herein are those
that are predominantly heterochromatic [formerly referred to as satellite
artificial chromosomes (SATACs); see, e.g., U.S. Patent Nos. 6,077,697
and 6,025,155 and published International PCT application No.
WO 97/40183], minichromosomes which contain a de novo centromere,
artificial chromosomes containing one or more regions of repeating nucleic
acid units wherein the repeat region(s) contain substantially equivalent
amounts of euchromatic and heterochromatic nucleic acid and in vitro
assembled artificial chromosomes. Of particular interest herein are artificial
chromosomes that introduce and express heterologous nucleic acids in
plants. These include artificial chromosomes that have a centromere derived
from a plant, and, also, artificial chromosomes that have centromeres that
may be derived from other organisms but that function in plants. Methods
for the construction, isolation, and delivery to target cells of each type of
artificial chromosome are provided herein.

As used herein, to target nucleic acid to a locus on a chromosome 
means that the nucleic acid integrates at or near the targeted locus. Any 
method or means for effecting such integration, including, but not limited to,
homologous recombination, is contemplated. 

As used herein, a dicentric chromosome is a chromosome that 
contains two centromeres. A multicentric chromosome contains more than 
two centromeres. 

As used herein, a formerly dicentric chromosome is a chromosome
that is produced when a dicentric chromosome fragments and acquires new
telomeres so that two chromosomes, each having one of the centromeres,
are produced. Each of the fragments is a replicable chromosome. If one of
the chromosomes undergoes amplification of primarily euchromatic DNA to
produce a fully functional chromosome that is predominantly euchromatin
(more than about 50%, more than about 70% or more than about 90%
euchromatin), it is a minichromosome. The remaining chromosome is a
formerly dicentric chromosome. If one of the chromosomes undergoes
amplification, whereby heterochromatin (such as, for example, satellite DNA)
is amplified and a euchromatic portion (such as, for example, an arm)
remains, it is referred to as a sausage chromosome. A chromosome that is
substantially all heterochromatin, except for portions of heterologous DNA, is
called a predominantly heterochromatic artificial chromosome. Predominantly
heterochromatic artificial chromosomes can be produced from other partially
heterochromatic artificial chromosomes by culturing the cell containing such
chromosomes under conditions that destabilize the chromosome and/or under
selective conditions so that a predominantly heterochromatic artificial
chromosome is produced. For purposes herein, it is understood that the
artificial chromosomes may not necessarily be produced in multiple steps,
but may appear after the initial introduction of the heterologous DNA.
Typically, artificial chromosomes appear after about 5 to about 60, or about
5 to about 55, or about 10 to about 55 or about 25 to about 55 or about 35
to about 55 cell divisions following introduction of nucleic acid into a cell.
Artificial chromosomes may, however, appear after only about 5 to about 15
or about 10 to about 15 cell divisions.

As used herein, the term "satellite DNA-based artificial chromosome
(SATAC)" is interchangeable with the term "artificial chromosome expression
system (ACes)". These artificial chromosomes (ACes) include those that are
substantially all neutral non-coding sequences (heterochromatin) except for
foreign heterologous, typically gene or protein-encoding, nucleic acid, that
may be interspersed within the heterochromatin for the expression therein
(see U.S. Patent Nos. 6,025,155 and 6,077,697 and International PCT
application No. WO 97/40183), or that is in a single locus as provided
herein. The delineating structural feature is the presence of repeating units,
which are generally predominantly heterochromatin. The precise structure of
the ACes will depend upon the structure of the chromosome in which the
initial amplification event occurs; all share the common feature of including a
defined pattern of repeating units. Generally ACes have more
heterochromatin than euchromatin. Foreign nucleic acid molecules
(heterologous genes) contained in these artificial chromosome expression
systems can include any nucleic acid whose expression is of interest in a
particular host cell.

As used herein, an artificial chromosome that is predominantly
heterochromatic (i.e., containing more heterochromatin than euchromatin,
typically more than about 50%, more than about 60%, more than about
70%, more than about 80% or more than about 90% heterochromatin) may
be produced by introducing nucleic acid molecules into cells, particularly
plant cells, and selecting cells that contain a predominantly heterochromatic
artificial chromosome. Any nucleic acid may be introduced into cells in the
methods of producing the artificial chromosomes. For example, the nucleic
acid may contain a selectable marker and/or a sequence that targets nucleic
acid to a heterochromatic region of a chromosome, particularly a plant
chromosome, such as in the pericentric heterochromatin, in the short arm of
acrocentric chromosomes, rDNA or nucleolar organizing regions. Targeting
sequences include, but are not limited to, lambda phage DNA and rDNA
(e.g., a sequence of an intergenic spacer of rDNA), particularly plant rDNA,
for production of predominantly heterochromatic artificial chromosomes in
plant cells.

After introducing the nucleic acid into cells, a cell containing a
predominantly heterochromatic artificial chromosome is selected. Such cells
may be identified using a variety of procedures. For example, repeating units
of heterochromatic DNA of these chromosomes may be discerned by G-
and/or C-banding and/or fluorescence in situ hybridization (FISH) techniques.
Prior to such analyses, the cells to be analyzed may be enriched with
artificial chromosome-containing cells by sorting the cells on the basis of the
presence of a selectable marker, such as a reporter protein, or by growing
(culturing) the cells under selective conditions. Selection of cells containing
amplified nucleic acids may also be facilitated by use of techniques such as
PCR and Southern blotting to identify cell lines with amplified regions. It is
also possible, after introduction of nucleic acids into cells, to select cells that
have a multicentric, typically dicentric, chromosome, a formerly multicentric
(typically dicentric) chromosome and/or various heterochromatic structures
and to treat them such that desired artificial chromosomes are produced.
Conditions for generation of a desired structure include, but are not limited
to, further growth under selective conditions, introduction of additional
nucleic acid molecules and/or growth under selective conditions and
treatment with destabilizing agents, and other such methods (see
International PCT application No. WO 97/40183 and U.S. Patent Nos.
6,025,155 and 6,077,697).

As used herein, heterologous and foreign are used interchangeably
with respect to nucleic acid and refer to any nucleic acid, including DNA and
RNA, that does not occur naturally as part of the genome in which it is
present or which is found in a location or locations in the genome that differ
from that in which it occurs in nature. Thus, heterologous or foreign nucleic
acid is nucleic acid that is not normally found in the host genome in an
identical context. It
is nucleic acid that is not endogenous to the cell and has been exogenously
introduced into the cell. Examples of heterologous DNA include, but are not
limited to, DNA that encodes a gene product or gene product(s) of interest,
introduced for purposes of modification of the endogenous genes or for
production of an encoded protein. For example, a heterologous or foreign
gene may be isolated from a different species than that of the host genome,
or alternatively, may be isolated from the host genome but operably linked to
one or more regulatory regions which differ from those found in the
unaltered, native gene. Other examples of heterologous DNA include, but
are not limited to, DNA that encodes traceable marker proteins, and DNA
that encodes a protein that confers an input trait including, but not limited to,
herbicide, insect, or disease resistance or an output trait, including, but not
limited to, oil quality or carbohydrate composition. Antibodies that are
encoded by heterologous DNA may be secreted, sequestered, stored in an
organ or tissue, accumulated in the cytoplasm or cellular organelles, or
expressed on the surface of the cell in which the heterologous DNA has been
introduced.

As used herein, a "selectable marker" is a composition that can be 
used to distinguish one cell from another cell. For example, a selectable
marker may be a nucleic acid encoding a readily detected protein that has
been introduced into some cells but not others. Detection of the expressed
protein in cells facilitates identification of cells containing the marker nucleic
acid by distinguishing them from cells that do not contain the nucleic acid.
Thus, for example, a selectable marker may be a fluorescent protein, such as
green fluorescent protein (GFP), or β-galactosidase (or a nucleic acid
encoding either of these proteins). Selectable markers such as these, which
are not required for cell survival and/or proliferation in the presence of a
selection agent, may also be referred to as reporter molecules. Other
selectable markers, e.g., the neomycin phosphotransferase gene, provide for
isolation and identification of cells containing them by conferring properties
on the cells that make them resistant to an agent, e.g., a drug such as an
antibiotic, that inhibits proliferation of cells that do not contain the marker.

As used herein, growth under selective conditions means growth of a
cell under conditions that require expression of a selectable marker for
survival.

As used herein, an agent that destabilizes a chromosome is any agent 
known by those of skill in the art to enhance amplification events, and/or 
mutations. Such agents, which include BrdU, are well known to those of
skill in the art.

In order to generate an artificial chromosome containing a particular 
heterologous nucleic acid of interest, it is possible to include the nucleic acid 
of interest in the nucleic acid that is being introduced into cells to initiate 
production of the artificial chromosome. Thus, for example, a nucleic acid of
interest could be introduced into a cell along with nucleic acid encoding a
selectable marker and/or a nucleic acid that targets to a heterochromatic
region of a chromosome. For example, the nucleic acid of interest can be
linked to targeting nucleic acid(s). Alternatively, heterologous nucleic acid of
interest can be introduced into an artificial chromosome at a later time after
the initial generation of the artificial chromosome.

As used herein, the minichromosome refers to a chromosome derived 
from a multicentric, typically dicentric, chromosome that contains more 
euchromatic than heterochromatic DNA. For purposes herein, the 
minichromosome contains a de novo centromere, preferably a centromere
that replicates in plants, more preferably a plant centromere.

As used herein, de novo, with reference to a centromere, refers to
generation of an excess centromere in a chromosome as a result of 
incorporation of a heterologous nucleic acid fragment using the methods 
herein. 

As used herein, in vitro assembled artificial chromosomes or synthetic
chromosomes are artificial chromosomes produced by joining essential
components of a chromosome in vitro. These components include at least a
centromere, a telomere and an origin of replication. An in vitro assembled
artificial chromosome may include one or more megareplicators. In particular
embodiments, the megareplicator contains sequences of rDNA, particularly
plant rDNA.

As used herein, in vitro assembled plant artificial chromosomes are
produced by joining components (e.g., the centromere, telomere(s),
megareplicator and an origin of replication) that function in plants, and
preferably, one or more of which is derived from a plant. In vitro assembled
artificial chromosomes may contain any amount of heterochromatic and/or
euchromatic nucleic acid. For example, an in vitro assembled artificial
chromosome may be substantially all heterochromatin, or may contain
increasing amounts of euchromatic DNA, such that, for example, it contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
about 90% euchromatic DNA. In vitro assembled artificial chromosomes
may contain one or more regions of segmentation as described with
reference to amplification-based artificial chromosomes.

As used herein, an artificial chromosome platform refers to an artificial
chromosome that has been engineered to include one or more sites for
site-specific recombination-directed integration. Included within the artificial
chromosome platforms are ACes, particularly plant ACes, that are so
engineered. Any sites, including but not limited to any described herein, that
are suitable for such integration are contemplated. Among the ACes
contemplated herein are those that are predominantly heterochromatic
(formerly referred to as satellite artificial chromosomes (SATACs); see, e.g.,
U.S. Patent Nos. 6,077,697 and 6,025,155 and published International PCT
application No. WO 97/40183), artificial chromosomes predominantly made
up of repeating nucleic acid units and that contain substantially equivalent
amounts of euchromatic and heterochromatic DNA or wherein the repeat
regions of the chromosomes contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid. Included among the ACes for
use in generating platforms are artificial chromosomes that introduce and
express heterologous nucleic acids in plants as described herein. These
include artificial chromosomes that have a centromere derived from a plant,
and, also, artificial chromosomes that have centromeres that may be derived
from other organisms but that function in plants.

As used herein, recognition sequences are particular sequences of
nucleotides that a protein, DNA, or RNA molecule, or combinations thereof,
(such as, but not limited to, a restriction endonuclease, a modification
methylase and a recombinase) recognizes and binds. For example, a
recognition sequence for Cre recombinase (see, e.g., SEQ ID No. 30) is a 34
base pair sequence containing two 13 base pair inverted repeats (serving as
the recombinase binding sites) flanking an 8 base pair core and designated
loxP (see, e.g., Sauer (1994) Current Opinion in Biotechnology 5:521-527).
Other examples of recognition sequences include, but are not limited to,
attB and attP, attR and attL and others (see, e.g., SEQ ID Nos. 32-48), that
are recognized by the recombinase enzyme Integrase (see SEQ ID Nos. 49
and 50 for the nucleotide and encoded amino acid sequences of an
exemplary lambda phage integrase).

The recombination site designated attB is an approximately 33 base
pair sequence containing two 9 base pair core-type Int binding sites and a 7
base pair overlap region; attP (SEQ ID No. 48) is an approximately 240 base
pair sequence containing core-type Int binding sites and arm-type Int binding
sites as well as sites for auxiliary proteins IHF, FIS, and Xis (see, e.g., Landy
(1993) Current Opinion in Biotechnology 3:699-707; see, e.g., SEQ ID Nos.
32 and 48).

As used herein, a recombinase is an enzyme that catalyzes the 
exchange of DNA segments at specific recombination sites. An integrase 
herein refers to a recombinase that is a member of the lambda (λ) integrase
family. 

As used herein, recombination proteins include excisive proteins, 
integrative proteins, enzymes, co-factors and associated proteins that are 
involved in recombination reactions using one or more recombination sites
(see, Landy (1993) Current Opinion in Biotechnology 3:699-707).

As used herein, the expression "lox site" means a sequence of
nucleotides at which the gene product of the cre gene, referred to
herein as Cre, can catalyze a site-specific recombination event. A LoxP site
is a 34 base pair nucleotide sequence from bacteriophage P1 (see, e.g.,
Hoess et al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398-3402). The LoxP
site contains two 13 base pair inverted repeats separated by an 8 base pair
spacer region as follows (SEQ ID NO. 51):

ATAACTTCGTATA ATGTATGC TATACGAAGTTAT

E. coli DH5Δlac and yeast strain BSY23 transformed with plasmid pBS44
carrying two loxP sites connected with a LEU2 gene are available from the
American Type Culture Collection (ATCC) under accession numbers ATCC
53254 and ATCC 20773, respectively. The lox sites can be isolated from
plasmid pBS44 with restriction enzymes EcoRI and SalI, or XhoI and BamHI.
In addition, a preselected DNA segment can be inserted into pBS44 at either
the SalI or BamHI restriction enzyme sites. Other lox sites include, but are
not limited to, LoxB, LoxL, LoxC2 and LoxR sites, which are nucleotide
sequences isolated from E. coli (see, e.g., Hoess et al. (1982) Proc. Natl.
Acad. Sci. U.S.A. 79:3398). Lox sites can also be produced by a variety of
synthetic techniques (see, e.g., Ito et al. (1982) Nuc. Acid Res. 10:1755 and
Ogilvie et al. (1981) Science 214:270).

As used herein, the expression "cre gene" means a sequence of
nucleotides that encodes a gene product that effects site-specific
recombination of DNA in eukaryotic cells at lox sites. One cre gene can be
isolated from bacteriophage P1 (see, e.g., Abremski et al. (1983) Cell
32:1301-1311). E. coli DH1 and yeast strain BSY90 transformed with
plasmid pBS39 carrying a cre gene isolated from bacteriophage P1 and a
GAL1 regulatory nucleotide sequence are available from the American Type
Culture Collection (ATCC) under accession numbers ATCC 53255 and ATCC
20772, respectively. The cre gene can be isolated from plasmid pBS39 with
restriction enzymes XhoI and SalI.
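The structure stated for the LoxP site (34 base pairs, two 13 base pair inverted repeats, an 8 base pair spacer) can be checked directly against the sequence of SEQ ID NO. 51 as printed in this section. The following is a minimal sketch; the function and constant names are illustrative and are not part of the disclosure.

```python
# Illustrative check of the LoxP layout described in the text.
# All names here are hypothetical helpers, not part of the specification.

LEFT_REPEAT = "ATAACTTCGTATA"   # first 13 bp repeat, as printed in the text
SPACER = "ATGTATGC"             # 8 bp spacer, as printed in the text
RIGHT_REPEAT = "TATACGAAGTTAT"  # second 13 bp repeat, as printed in the text
LOXP = LEFT_REPEAT + SPACER + RIGHT_REPEAT


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def describe_loxp() -> dict:
    """Summarize the features of the LoxP site that the text states."""
    return {
        "length": len(LOXP),
        "repeat_length": len(LEFT_REPEAT),
        "spacer_length": len(SPACER),
        # The two arms are inverted repeats if one is the reverse
        # complement of the other.
        "arms_are_inverted_repeats": reverse_complement(LEFT_REPEAT) == RIGHT_REPEAT,
        # The spacer is not its own reverse complement, which is what
        # gives the site an orientation.
        "spacer_is_asymmetric": reverse_complement(SPACER) != SPACER,
    }


if __name__ == "__main__":
    for key, value in describe_loxp().items():
        print(f"{key}: {value}")
```

The asymmetry of the spacer is what allows two lox sites on one molecule to be described as having the same or opposite orientations.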

As used herein, site-specific recombination refers to site-specific 
recombination that is effected between two specific sites on a single nucleic 
acid molecule or between two different molecules that requires the presence 
of an exogenous protein, such as an integrase or recombinase. 

For example, Cre-lox site-specific recombination can include the
following three events:

a. deletion of a pre-selected DNA segment flanked by lox sites;

b. inversion of the nucleotide sequence of a pre-selected
DNA segment flanked by lox sites; and

c. reciprocal exchange of DNA segments proximate to lox
sites located on different DNA molecules.

This reciprocal exchange of DNA segments can result in an integration
event if one or both of the DNA molecules are circular. DNA segment refers
to a linear fragment of single- or double-stranded deoxyribonucleic acid
(DNA), which can be derived from any source. Since the lox site is an
asymmetrical nucleotide sequence, two lox sites on the same DNA molecule
can have the same or opposite orientations with respect to each other.
Recombination between lox sites in the same orientation results in a deletion
of the DNA segment located between the two lox sites and a connection
between the resulting ends of the original DNA molecule. The deleted DNA
segment forms a circular molecule of DNA. The original DNA molecule and
the resulting circular molecule each contain a single lox site. Recombination
between lox sites in opposite orientations on the same DNA molecule results
in an inversion of the nucleotide sequence of the DNA segment located
between the two lox sites. In addition, reciprocal exchange of DNA
segments proximate to lox sites located on two different DNA molecules can
occur. All of these recombination events are catalyzed by the gene product
of the cre gene. Thus, the Cre-lox system can be used to specifically delete,
invert, or insert DNA. The precise event is controlled by the orientation of
lox DNA sequences. In cis, the lox sequences direct the Cre recombinase to
either delete (lox sequences in direct orientation) or invert (lox sequences in
inverted orientation) DNA flanked by the sequences, while in trans the lox
sequences can direct a homologous recombination event resulting in the
insertion of a recombinant DNA.
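The two in cis outcomes described above (deletion for lox sites in direct orientation, inversion for lox sites in opposite orientation) can be illustrated with a small string-level model. This is a sketch under simplifying assumptions: DNA is treated as a single-strand text string, exactly two intact LoxP sites are present, and the enzymology of Cre is not modeled. The function names are hypothetical.

```python
# Toy model of Cre-lox recombination in cis on a linear DNA string.
# Names and the string-level treatment are illustrative assumptions.

LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # SEQ ID NO. 51 without spaces


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def find_lox_sites(seq: str) -> list:
    """Return (start, orientation) pairs; +1 is forward, -1 is reversed."""
    forward, reverse = LOXP, reverse_complement(LOXP)
    sites = []
    for start in range(len(seq) - len(LOXP) + 1):
        window = seq[start:start + len(LOXP)]
        if window == forward:
            sites.append((start, +1))
        elif window == reverse:
            sites.append((start, -1))
    return sites


def recombine_in_cis(seq: str):
    """Return (event, linear_product, excised_circle_or_None)."""
    sites = find_lox_sites(seq)
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two lox sites")
    (first, first_dir), (second, second_dir) = sites
    site_len = len(LOXP)
    if first_dir == second_dir:
        # Direct orientation: the flanked segment is excised as a circle.
        # The linear product and the circle each keep a single lox site.
        linear = seq[:first] + seq[second:]
        circle = seq[first:second]
        return "deletion", linear, circle
    # Opposite orientation: the segment between the sites is inverted.
    inner = seq[first + site_len:second]
    linear = seq[:first + site_len] + reverse_complement(inner) + seq[second:]
    return "inversion", linear, None


if __name__ == "__main__":
    direct = "GGGG" + LOXP + "AAAC" + LOXP + "CCCC"
    print(recombine_in_cis(direct)[0])
    opposite = "GGGG" + LOXP + "AAAC" + reverse_complement(LOXP) + "CCCC"
    print(recombine_in_cis(opposite)[0])
```

The circle returned for a deletion is written as a linear string only for convenience; in the model it stands for the circular molecule that carries the second lox site.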

As used herein, a plant refers to an organism that is taxonomically
classified as being in the kingdom Plantae. Such organisms include
eukaryotic organisms that contain chloroplasts capable of carrying out
photosynthesis. A plant can be unicellular or multicellular and can contain
multiple tissues and/or organs. Plants can reproduce sexually and/or
asexually and include species that are perennial or annual in growth habit.
Plants can be found to exist in a variety of habitats, including terrestrial and
aquatic environments. The term "plant" includes a whole plant, plant cell,
plant protoplast, plant calli, plant seed, plant organ, plant tissue, and other
parts of a whole plant.

As used herein, reproductive mode with reference to a plant refers to
any and all methods by which a plant produces progeny. Reproductive
modes include, but are not limited to, sexual and asexual reproduction.
Plants may produce progeny by one or multiple reproductive modes. Sexual
reproduction can include union of cells derived from haploid gametophytes
(e.g., eggs produced from ovules and sperm produced from pollen in seed
plants) to form diploid zygotes. Zygotes may be formed from gametophytes
from different plants or from gametophytes of the same plant (e.g., through
self-fertilization). Asexual reproduction can occur when offspring are
produced through modifications of the sexual life cycle that do not include
meiosis and syngamy. For example, when vascular plants reproduce
asexually, they may do so by vegetative reproduction, such as budding,
branching, and tillering, or by producing spores or seed genetically identical
to the sporophytes that produced them.

As used herein, stable maintenance of chromosomes occurs when at
least about 85%, preferably 90%, more preferably 95%, of the cells retain
the chromosome. Stability is measured in the presence of a selective agent.
Preferably these chromosomes are also maintained in the absence of a
selective agent. Stable chromosomes also retain their structure during cell
culturing, suffering no unintended intrachromosomal or interchromosomal
rearrangements.
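The retention thresholds in this definition (at least about 85%, preferably 90%, more preferably 95% of cells retaining the chromosome) amount to a simple calculation on scored cells. The sketch below is only an illustration: the function names and tier labels are hypothetical, and the cutoffs are treated as exact even though the text says "about".

```python
# Illustrative calculation of chromosome retention against the thresholds
# stated in the text. Function names and tier labels are hypothetical.


def retention_fraction(cells_with_chromosome: int, cells_scored: int) -> float:
    """Fraction of scored cells that retain the artificial chromosome."""
    if cells_scored <= 0:
        raise ValueError("cells_scored must be positive")
    if not 0 <= cells_with_chromosome <= cells_scored:
        raise ValueError("cells_with_chromosome must lie in [0, cells_scored]")
    return cells_with_chromosome / cells_scored


def stability_tier(fraction: float) -> str:
    """Map a retention fraction onto the tiers named in the definition."""
    if fraction >= 0.95:
        return "more preferred (>= 95%)"
    if fraction >= 0.90:
        return "preferred (>= 90%)"
    if fraction >= 0.85:
        return "stable (>= 85%)"
    return "not stably maintained"


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(stability_tier(retention_fraction(88, 100)))
```

Retention alone does not cover the second requirement of the definition, that the chromosome retain its structure without unintended rearrangements; that would need a separate structural assay.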

As used herein, BrdU refers to 5-bromodeoxyuridine, which during
replication is inserted in place of thymidine. BrdU is used as a mutagen; it 
also inhibits condensation of metaphase chromosomes during cell division. 

As used herein, ribosomal RNA (rRNA) is the specialized RNA that
forms part of the structure of a ribosome and participates in the synthesis of
proteins. Ribosomal RNA is produced by transcription of genes which, in
eukaryotic cells, are present in multiple copies. In human cells, the
approximately 250 copies of rRNA genes (i.e., genes which encode rRNA)
per haploid genome are spread out in clusters on at least five different
chromosomes (chromosomes 13, 14, 15, 21 and 22). In mouse cells, the
presence of ribosomal DNA (rDNA, which is DNA containing sequences that
encode rRNA) has been verified on at least 11 pairs out of 20 mouse
chromosomes (chromosomes 5, 6, 7, 9, 11, 12, 15, 16, 17, 18, and 19)
[see, e.g., Rowe et al. (1996) Mamm. Genome 7:886-889 and Johnson et al.
(1993) Mamm. Genome 4:49-52]. In Arabidopsis thaliana the presence of
rDNA has been verified on chromosomes 2 and 4 (18S, 5.8S, and 25S
rDNA) and on chromosomes 3, 4, and 5 (5S rDNA) [see The Arabidopsis
Genome Initiative (2000) Nature 408:796-815]. In eukaryotic cells, the
multiple copies of the highly conserved rRNA genes are located in a tandemly
arranged series of rDNA units, which are generally about 40-45 kb in length
and contain a transcribed region and a nontranscribed region known as
spacer (i.e., intergenic spacer) DNA which can vary in length and sequence.
In the human and mouse, these tandem arrays of rDNA units are located
adjacent to the pericentric satellite DNA sequences (heterochromatin). The
regions of these chromosomes in which the rDNA is located are referred to
as nucleolar organizing regions (NOR) which loop into the nucleolus, the site
of ribosome production within the cell nucleus. In higher plants, the rDNA is
arranged in long tandem repeating units, similar to those of other higher
eukaryotes. The 18S, 5.8S and 25S rRNA genes are clustered and are
transcribed as one unit, while the 5S genes are located elsewhere in the
genome. Between the 3' end of the 25S gene and the 5' end of the 18S
gene is located a DNA spacer that ranges from 1 kb to greater than 12 kb in
length for different species. Therefore, the rDNA repeat ranges from about 4
kb to about 15 kb for different plant species [see, e.g., Rogers and Bendich
(1987) Plant Mol. Biol. 9:509-520].

As used herein, a megachromosome refers to a chromosome that,
except for introduced heterologous DNA, is substantially composed of
heterochromatin. Megachromosomes are made up of an array of repeated
amplicons that contain two inverted megareplicons bordered by introduced
heterologous DNA [see, e.g., Figure 3 of U.S. Patent No. 6,077,697 for a
schematic drawing of a megachromosome]. For purposes herein, a
megachromosome is about 50 to 400 Mb, generally about 250-400 Mb.
Shorter variants are also referred to as truncated megachromosomes [about
90 to 120 or 150 Mb], dwarf megachromosomes [~150-200 Mb] and cell
lines, and a micro-megachromosome [~50-90 Mb, typically 50-60 Mb]. For
purposes herein, the term megachromosome refers to the overall repeated
structure based on an array of repeated chromosomal segments (amplicons)
that contain two inverted megareplicons bordered by any inserted
heterologous DNA.
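The approximate size ranges given above can be collected into a small lookup for reference. This is only a sketch: the ranges in the text are qualified by "about", so the inclusive boundaries used here are an assumption, a size that falls on a shared boundary is reported under both names, and the table and function names are hypothetical.

```python
# Lookup of the approximate megachromosome size ranges stated in the text.
# Inclusive boundaries are an assumption; the text gives "about" values.

SIZE_CLASSES_MB = [
    ("micro-megachromosome", 50, 90),
    ("truncated megachromosome", 90, 150),
    ("dwarf megachromosome", 150, 200),
    ("megachromosome (general range)", 250, 400),
]


def size_classes(size_mb: float) -> list:
    """Return every named size class whose range includes size_mb."""
    return [name for name, low, high in SIZE_CLASSES_MB if low <= size_mb <= high]


if __name__ == "__main__":
    for size in (60, 150, 225, 300):
        print(size, size_classes(size))
```

A size between about 200 and 250 Mb matches none of the named ranges, although it still lies within the overall 50 to 400 Mb span stated for megachromosomes.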

As used herein, transformation and transfection are used
interchangeably to refer to the process of introducing nucleic acid
into cells. The terms transfection and transformation refer to the
taking up of exogenous nucleic acid, e.g., an expression vector, by a host
cell whether or not any coding sequences are in fact expressed. Numerous
methods of introducing nucleic acids into cells are known to the ordinarily
skilled artisan, for example, Agrobacterium-mediated transformation,
protoplast transfection (including polyethylene glycol (PEG)-mediated
transfection, electroporation, protoplast fusion, and microcell fusion),
lipid-mediated delivery, liposomes, electroporation, microinjection, particle
bombardment and silicon carbide whisker-mediated transformation (see, e.g.,
Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985) Mol.
Gen. Genet. 199:169-177; Reich et al. (1986) Biotechnology 4:1001-1004;
Klein et al. (1987) Nature 327:70-73; U.S. Patent No. 6,143,949;
Paszkowski et al. (1989) in Cell Culture and Somatic Cell Genetics of Plants,
Vol. 6, Molecular Biology of Plant Nuclear Genes, eds. Schell, J and Vasil,
L.K. Academic Publishers, San Diego, California, p. 52-68; and Frame et al.
(1994) Plant J. 6:941-948), direct uptake using calcium phosphate [CaPO4;
see, e.g., Wigler et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1373-1376],
polyethylene glycol [PEG]-mediated DNA uptake, lipofection [see, e.g.,
Strauss (1996) Meth. Mol. Biol. 54:307-327], microcell fusion [see Lambert
(1991) Proc. Natl. Acad. Sci. U.S.A. 88:5907-5911; U.S. Patent No.
5,396,767; Sawford et al. (1987) Somatic Cell Mol. Genet. 13:279-284;
Dhar et al. (1984) Somatic Cell Mol. Genet. 10:547-559; and McNeill-Killary
et al. (1995) Meth. Enzymol. 254:133-152], lipid-mediated carrier systems
[see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al. (1996)
Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim.
31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654; Le Bolch et
al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993) Meth.
Enzymol. 217:599-618] or other suitable method. Successful transfection is
generally recognized by detection of the presence of the heterologous nucleic
acid within the transfected cell, such as, for example, any visualization of the
heterologous nucleic acid or any indication of the operation of a vector within
the host cell.

As used herein, injected refers to the microinjection (use of a small
syringe, needle, or pipette) of nucleic acid into a cell.

As used herein, gene therapy involves the transfer or insertion of
nucleic acid molecules into certain cells, which are also referred to as target
cells, to produce products that are involved in preventing, curing, correcting,
controlling or modulating diseases, disorders and/or deleterious conditions.
The nucleic acid is introduced into the selected target cells in a manner such
that the nucleic acid is expressed and a product encoded thereby is
produced. Alternatively, the nucleic acid may in some manner mediate
expression of DNA that encodes a therapeutic product. This product may be
a therapeutic compound, which is produced in therapeutically effective
amounts or at a therapeutically useful time. It may also encode a product,
such as a peptide or RNA, that in some manner mediates, directly or
indirectly, expression of a therapeutic product. Expression of the nucleic
acid by the target cells within an organism afflicted with a disease or
disorder thereby enables modulation of the disease or disorder. The nucleic
acid encoding the therapeutic product may be modified prior to introduction
into the cells of the afflicted host in order to enhance or otherwise alter the
product or expression thereof.

For use in gene therapy, cells can be transfected in vitro, followed by
introduction of the transfected cells into an organism. This is often referred
to as ex vivo gene therapy. Alternatively, the cells can be transfected
directly in vivo within an organism.

As used herein, a therapeutically effective product is a product that 
effectively ameliorates or eliminates the symptoms or manifestations of an 
inherited or acquired disease or disorder or that cures said disease or disorder
in an organism. For example, therapeutically effective products include a 
product that is encoded by heterologous DNA expressed in a diseased 
organism and a product produced from heterologous DNA in a host cell and 
to which a diseased organism is exposed. 

As used herein, a transgenic plant refers to a plant (e.g., a plant cell,
tissue, organ or whole plant) containing heterologous or foreign nucleic acid
or in which the expression of a gene naturally present in the plant has been
altered. Heterologous nucleic acid within a transgenic plant may be
transiently or stably maintained within the plant. Stable maintenance of
heterologous nucleic acid may be maintenance of the nucleic acid through
one or more, or two or more, or five or more, or ten or more, or 25 or more,
or 50 or more or 60 or more cell divisions. A transgenic plant may contain
heterologous nucleic acid in one cell, multiple cells or all cells. A transgenic
plant may produce progeny that contain or do not contain the heterologous
nucleic acid.

As used herein, a promoter, with respect to a region of DNA, refers to
a sequence of DNA that contains a sequence of bases that signals RNA
polymerase to associate with the DNA and initiate transcription of messenger
RNA (mRNA) from a template strand of the DNA. A promoter thus generally
regulates transcription of DNA into mRNA.

As used herein, operative linkage of heterologous DNA to regulatory
and effector sequences of nucleotides, such as promoters, enhancers,
transcriptional and translational stop sites, and other signal sequences, refers
to the relationship between such DNA and such sequences of nucleotides.
For example, operative linkage of heterologous DNA to a promoter refers to
the physical relationship between the DNA and the promoter such that the
transcription of such DNA is initiated from the promoter by an RNA
polymerase that specifically recognizes, binds to and transcribes the DNA in
reading frame.

As used herein, isolated, substantially pure nucleic acid, such as, for
example, DNA, refers to nucleic acid fragments purified according to
standard techniques employed by those skilled in the art, such as those found
in Maniatis et al. [(1982) Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, NY].

As used herein, expression refers to the transcription and/or
translation of nucleic acid. For example, expression can be the transcription
of a gene into an RNA molecule, such as a messenger RNA (mRNA)
molecule. Expression may further include translation of an RNA molecule
into peptides, polypeptides, or proteins. If the nucleic acid is derived from
genomic DNA, expression may, if an appropriate eukaryotic host cell or
organism is selected, include splicing of the mRNA. With respect to an
antisense construct, expression may refer to the transcription of the
antisense DNA.

As used herein, vector or plasmid refers to discrete elements that are
used to introduce heterologous nucleic acids into cells for either expression
of the heterologous nucleic acid or for replication of the heterologous nucleic
acid. Selection and use of such vectors and plasmids are well within the
level of skill of the art.

As used herein, substantially homologous DNA refers to DNA that
includes a sequence of nucleotides that is sufficiently similar to another such
sequence to form stable hybrids under specified conditions.

It is well known to those of skill in this art that nucleic acid fragments
with different sequences may, under the same conditions, hybridize
detectably to the same "target" nucleic acid. Two nucleic acid fragments
hybridize detectably, under stringent conditions over a sufficiently long
hybridization period, because one fragment contains a segment of at least
about 14 nucleotides in a sequence which is complementary (or nearly
complementary) to the sequence of at least one segment in the other nucleic
acid fragment. If the time during which hybridization is allowed to occur is
held constant, at a value during which, under preselected stringency
conditions, two nucleic acid fragments with exactly complementary
base-pairing segments hybridize detectably to each other, departures from
exact complementarity can be introduced into the base-pairing segments, and
base-pairing will nonetheless occur to an extent sufficient to make
hybridization detectable. As the departure from complementarity between the
base-pairing segments of two nucleic acids becomes larger, and as conditions
of the hybridization become more stringent, the probability decreases that the
two segments will hybridize detectably to each other.

Two single-stranded nucleic acid segments have "substantially the
same sequence," within the meaning of the present specification, if (a) both
form a base-paired duplex with the same segment, and (b) the melting
temperatures of said two duplexes in a solution of 0.5 x SSPE differ by less
than 10°C. If the segments being compared have the same number of
bases, then to have "substantially the same sequence", they will typically
differ in their sequences at fewer than 1 base in 10. Methods for determining
melting temperatures of nucleic acid duplexes are well known [see, e.g.,
Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284 and references
cited therein].
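The two criteria above (melting temperatures within 10°C, and typically fewer than 1 differing base in 10 for segments of equal length) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the melting-temperature estimate is a commonly used empirical approximation for long DNA duplexes that is assumed here for illustration; it is not a formula given in this specification, and the sodium concentration corresponding to 0.5 x SSPE must be supplied by the user.

```python
import math


def estimate_tm_celsius(gc_percent, length_bp, na_molar, formamide_percent=0.0):
    """Rough melting temperature (deg C) of a long DNA duplex.

    Assumed empirical approximation (not taken from this specification):
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def mismatch_fraction(seq_a, seq_b):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same number of bases")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return differences / len(seq_a)


def same_by_melting_temperature(tm_a, tm_b, max_delta_c=10.0):
    """Criterion (b): the two duplex melting temperatures differ by less than 10 deg C."""
    return abs(tm_a - tm_b) < max_delta_c


def same_by_mismatch_count(seq_a, seq_b):
    """Typical expectation for equal-length segments: fewer than 1 base in 10 differs."""
    return mismatch_fraction(seq_a, seq_b) < 0.1
```

In practice the melting temperatures would be measured, or estimated for each duplex formed with the common target segment, and then passed to `same_by_melting_temperature`; the mismatch check is only the rule of thumb stated for segments having the same number of bases.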

As used herein, a nucleic acid probe is a DNA or RNA fragment that
includes a sufficient number of nucleotides to specifically hybridize to DNA or
RNA that includes identical or closely related sequences of nucleotides. A
probe may contain any number of nucleotides, from as few as about 10 to
as many as hundreds of thousands of nucleotides. The conditions and
protocols for such hybridization reactions are well known to those of skill in
the art, as are the effects of probe size, temperature, degree of mismatch,
salt concentration and other parameters on the hybridization reaction. For
example, the lower the temperature and the higher the salt concentration at
which the hybridization reaction is carried out, the greater the degree of
mismatch that may be present in the hybrid molecules.

To be used as a hybridization probe, the nucleic acid is generally
rendered detectable by labelling it with a detectable moiety or label, such as
³²P, ³H and ¹⁴C, or by other means, including chemical labelling, such as by
nick-translation in the presence of deoxyuridylate biotinylated at the
5'-position of the uracil moiety. The resulting probe includes the biotinylated
uridylate in place of thymidylate residues and can be detected (via the biotin
moieties) by any of a number of commercially available detection systems
based on binding of streptavidin to the biotin. Such commercially available
detection systems can be obtained, for example, from Enzo Biochemicals,
Inc. (New York, NY). Any other label known to those of skill in the art,
including non-radioactive labels, may be used as long as it renders the probes
sufficiently detectable, which is a function of the sensitivity of the assay, the
time available (for culturing cells, extracting DNA, and hybridization assays),
the quantity of DNA or RNA available as a source of the probe, the particular
label and the means used to detect the label.

Once sequences with a sufficiently high degree of homology to the
probe are identified, they can readily be isolated by standard techniques,
which are described, for example, by Maniatis et al. [(1982) Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY].

As used herein, conditions under which DNA molecules form stable
hybrids and are considered substantially homologous are such that DNA
molecules with at least about 60% complementarity form stable hybrids.
Such DNA fragments are herein considered to be "substantially
homologous". For example, DNA that encodes a particular protein is
substantially homologous to another DNA fragment if the DNA forms stable
hybrids such that the sequences of the fragments are at least about 60%
complementary and if a protein encoded by the DNA retains its activity.

For purposes herein, the following stringency conditions are defined:
1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C
2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C
3) low stringency: 1.0 x SSPE, 0.1% SDS, 50°C
or any combination of salt and temperature and other reagents that result in
selection of the same degree of mismatch or matching.
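The three defined conditions can be captured as a small lookup table. The following Python sketch is illustrative only: the class and function names are hypothetical, and the comparison uses the simple assumption that lower salt together with equal or higher temperature means equal or greater stringency. It does not attempt to model the equivalent combinations of salt, temperature and other reagents that the definition also allows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    sspe_x: float       # SSPE concentration, in multiples of 1 x SSPE
    sds_percent: float  # SDS concentration, percent
    temp_c: float       # wash temperature, degrees Celsius


# Values copied from the definitions in the text.
STRINGENCY = {
    "high": Stringency(sspe_x=0.1, sds_percent=0.1, temp_c=65.0),
    "medium": Stringency(sspe_x=0.2, sds_percent=0.1, temp_c=50.0),
    "low": Stringency(sspe_x=1.0, sds_percent=0.1, temp_c=50.0),
}


def at_least_as_stringent(a, b):
    """True if condition a has no more salt and no lower temperature than b.

    Assumption for illustration: this partial ordering ignores other
    reagents and any salt/temperature trade-off.
    """
    return a.sspe_x <= b.sspe_x and a.temp_c >= b.temp_c
```

For example, `at_least_as_stringent(STRINGENCY["high"], STRINGENCY["medium"])` evaluates to true, while the reverse comparison does not.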

As used herein, all assays and procedures, such as hybridization
reactions and antibody-antigen reactions, unless otherwise specified, are
conducted under conditions recognized by those of skill in the art as
standard conditions.

A. Amplification of Chromosomal Segments and Use Thereof in the 
Generation of Artificial Chromosomes 

The methods, cells and artificial chromosomes provided herein are
produced by virtue of the discovery of the existence of a higher-order
replication unit (megareplicon) of the centromeric region, including the
pericentric DNA, of a chromosome. This megareplicon is delimited by a
primary replication initiation site (megareplicator), and appears to facilitate
replication of the centromeric heterochromatin, and, most likely,
centromeres. Integration of heterologous nucleic acid into the megareplicator
region, or in close proximity thereto, initiates a large-scale amplification of
megabase-size chromosomal segments. Products of such amplification may
be used as artificial chromosomes or in the generation of artificial
chromosomes as described herein.

Included among the DNA sequences that may provide a
megareplicator are the rDNA units that give rise to ribosomal RNA (rRNA). In
plants and animals, particularly mammals such as mice and humans, these
rDNA units can contain specialized elements, such as the origin of replication
(or origin of bidirectional replication, i.e., OBR, in mouse) and amplification
promoting sequences (APS) and amplification control elements (ACE) [see,
e.g., with respect to plant rDNA, U.S. Patent Nos. 6,096,546 (to Raskin) and
6,100,092 (to Borysyuk et al.); PCT International Application Publication No.
WO99/66058; GenBank Accession no. Y08422 (containing the central
AT-rich region of a tobacco rDNA intergenic spacer); Borysyuk et al. (1997)
Plant Mol. Biol. 35:655-660; Borysyuk et al. (2000) Nature Biotechnology
18:1303-1306; Hernandez et al. (1993) EMBO J. 12:1475-1485; Van't Hof
and Lamm (1992) Plant Mol. Biol. 20:377-382; Hernandez et al. (1988) Plant
Mol. Biol. 10:413-422; and with respect to mammalian rDNA, Gogel et al.
(1996) Chromosoma 104:511-518; Coffman et al. (1993) Exp. Cell. Res.
209:123-132; Little et al. (1993) Mol. Cell. Biol. 13:6600-6613; Yoon et al.
(1995) Mol. Cell. Biol. 15:2482-2489; Gonzalez and Sylvester (1995)
Genomics 27:320-328; Miesfeld and Arnheim (1982) Nuc. Acids Res.
10:3933-3949; Maden et al. (1987) Biochem. J. 246:519-527].

As described herein, without being bound by any theory, specialized
elements such as these may facilitate replication and/or amplification of
megabase-size chromosomal segments in the de novo formation of
chromosomes, such as the artificial chromosomes described herein, in cells.
These specialized elements are typically located in the nontranscribed
intergenic spacer region upstream of the transcribed region of rDNA. The
intergenic spacer region may itself contain internally repeated sequences
which can be classified as tandemly repeated blocks and nontandem blocks
(see, e.g., Gonzalez and Sylvester (1995) Genomics 27:320-328). In mouse
rDNA, an origin of bidirectional replication may be found within a 3-kb
initiation zone centered approximately 1.6 kb upstream of the transcription
start site (see, e.g., Gogel et al. (1996) Chromosoma 104:511-518). The
sequences of these specialized elements tend to have an altered chromatin
structure, which may be detected, for example, by nuclease hypersensitivity
or the presence of AT-rich regions that can give rise to bent DNA structures.

Sequences of intergenic spacer regions of plant rDNA include, but are
not limited to, sequences contained in GenBank Accession numbers S70723
(from the 5S rDNA of barley (Hordeum vulgare)), AF013103 and X03989
(from maize (Zea mays)), X65489 (from potato (Solanum tuberosum)),
X52265 (from tomato (Lycopersicon esculentum)), AF177418 (from
Arabidopsis neglecta), AF177421 and AF17422 (from Arabidopsis halleri),
A71562, X15550, and X52631 (from Arabidopsis thaliana; see Gruendler et
al. (1991) J. Mol. Biol. 221:1209-1222 and Gruendler et al. (1989) Nucleic
Acids Res. 17:6395-6396), X54194 (from rice (Oryza sativa)) and Y08422
and D76443 (from tobacco (Nicotiana tabacum)). Sequences of intergenic
spacer regions of plant rDNA further include sequences from rye (see Appels
et al. (1986) Can. J. Genet. Cytol. 28:673-685), wheat (see Barker et al.
(1988) J. Mol. Biol. 201:1-17 and Sardana and Flavell (1996) Genome
39:288-292), radish (see Delcasso-Tremousaygue et al. (1988) Eur. J.
Biochem. 172:767-776), Vicia faba and Pisum sativum (see Kato et al.
(1990) Plant Mol. Biol. 14:983-993), mung bean (see Gerstner et al. (1988)
Genome 30:723-733; and Schiebel et al. (1989) Mol. Gen. Genet.
218:302-307), tomato (see Schmidt-Puchta et al. (1989) Plant Mol. Biol.
13:251-253), Hordeum bulbosum (see Procunier et al. (1990) Plant Mol. Biol.
15:661-663) and Lens culinaris Medik., and other legume species (see
Fernandez et al. (2000) Genome 43:597-603). Nucleic acids containing
intergenic spacer sequences from plants can be obtained by nucleic acid
amplification of DNA from plant cells using oligonucleotide primers
corresponding to the 3' end of the conserved 25S mature rRNA encoding
region and the 5' end of the conserved 18S mature rRNA encoding region
(see, e.g., PCT Application Publication No. WO98/13505).
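The primer-based recovery of an intergenic spacer described above can be pictured with a toy in silico amplification. This Python sketch is a simplified illustration under stated assumptions: the function names are hypothetical, primers must match the template exactly, and the sequences in the usage note are invented placeholders rather than real 25S or 18S rRNA gene sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_amplify(template, forward_primer, reverse_primer):
    """Return the predicted amplicon on the given strand, or None.

    forward_primer: same sense as the template (e.g., taken from the 3' end
                    of the 25S coding region).
    reverse_primer: anneals to the template strand, so its reverse complement
                    is searched for downstream (e.g., at the 5' end of the
                    18S coding region of the next repeat unit).
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start < 0:
        return None
    downstream_site = reverse_complement(reverse_primer)
    end = template.find(downstream_site, start + len(forward_primer))
    if end < 0:
        return None
    return template[start:end + len(downstream_site)]
```

With a placeholder template such as `"TTTTGATTACACCCCCCAGGTCATTT"`, a forward primer `"GATTACA"` and a reverse primer `reverse_complement("AGGTCA")`, the function returns the segment spanning both primer sites and the spacer between them.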

An exemplary sequence encompassing a mammalian origin of
replication is provided in GENBANK accession no. X82564 at about positions
2430-5435. Exemplary sequences encompassing mammalian
amplification-promoting sequences include nucleotides 690-1060 and
1105-1530 of GENBANK accession no. X82564 and are also provided in PCT
Application Publication No. WO 97/40183. Exemplary sequences encompassing
plant amplification-promoting sequences (APS) include those provided in U.S.
Patent No. 6,100,092.

In human rDNA, a primary replication initiation site may be found a
few kilobase pairs upstream of the transcribed region and secondary initiation
sites may be found throughout the nontranscribed intergenic spacer region
(see, e.g., Yoon et al. (1995) Mol. Cell. Biol. 15:2482-2489). A complete
human rDNA repeat unit is presented in GENBANK as accession no. U13369.
Another exemplary sequence encompassing a replication initiation site may
be found within the sequence of nucleotides 35355-42486 in GENBANK
accession no. U13369, particularly within the sequence of nucleotides
37912-42486 and more particularly within the sequence of nucleotides
37912-39288 of GENBANK accession no. U13369 (see Coffman et al.
(1993) Exp. Cell. Res. 209:123-132).
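The nested coordinate ranges cited for accession no. U13369 can be handled with a small helper. The following Python sketch assumes the positions are 1-based and inclusive, as is conventional for GenBank records; the `Region` type and its methods are hypothetical names introduced for illustration, and no actual sequence data are included.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    def length(self):
        return self.end - self.start + 1

    def contains(self, other):
        """True if 'other' lies entirely within this region."""
        return self.start <= other.start and other.end <= self.end

    def extract(self, sequence):
        """Slice a Python string using 1-based inclusive coordinates."""
        return sequence[self.start - 1:self.end]


# Ranges quoted in the text for the human rDNA repeat unit (U13369),
# listed from the broadest to the most particular.
U13369_INITIATION_REGIONS = [
    Region(35355, 42486),
    Region(37912, 42486),
    Region(37912, 39288),
]
```

Each successive region in the list lies within the one before it, which mirrors the "particularly" and "more particularly" wording of the text; `extract` would be applied to the full record sequence once it has been obtained.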

B. Preparation of Plant Artificial Chromosomes 

Cell lines containing artificial chromosomes can be prepared by
transforming cells, preferably a stable cell line, with heterologous nucleic acid
and identifying cells that contain an artificial chromosome as described
herein. The artificial chromosome is a chromosomal structure that is distinct
from any chromosome that existed in the cell prior to introduction of the
heterologous nucleic acid. A cell containing an artificial chromosome may be
identified using a variety of procedures, alone or in combination, as described
in detail herein. In particular embodiments of the methods described herein,
the heterologous nucleic acid contains a sequence that targets the nucleic
acid to an amplifiable region of a chromosome in the cell, such as, for
example, the pericentric heterochromatin and/or rDNA. A variety of targeting
sequences are provided herein.

Prior to analyzing transformed cells for the presence of an artificial
chromosome, the cells to be analyzed may be enriched with artificial
chromosome-containing cells using a variety of techniques depending on the
heterologous nucleic acid that was introduced into the host cell to initiate
generation of the artificial chromosomes. For example, if nucleic acid
encoding a selectable marker was included in the heterologous nucleic acid,
cells containing the marker may be selected for analysis. If the selectable
marker is one that confers resistance to a cytotoxic agent, e.g., bialaphos,
hygromycin or kanamycin, the transformed cells may be cultured under
selective conditions which include the agent. Cells surviving growth under
selective conditions are then analyzed for the presence of artificial
chromosomes. If the selectable marker is a readily detectable reporter
molecule, such as, for example, a fluorescent protein, the transformed cells
may be selected on the basis of fluorescent properties. For example, cells
containing the fluorescent protein may be isolated from nontransformed cells
using a fluorescence-activated cell sorter (FACS).

In analyzing transformed cells for the presence of artificial
chromosomes, it is also possible to identify cells that have a multicentric,
typically dicentric, chromosome, formerly multicentric (typically dicentric)
chromosome, minichromosome and/or heterochromatic structures, such as a
megachromosome and a sausage chromosome. If cells containing
multicentric chromosomes or formerly multicentric (typically formerly
dicentric) chromosomes are initially selected, these cells can then be
manipulated, if need be, as described herein to produce the
minichromosomes and other artificial chromosomes, particularly the
heterochromatic artificial chromosomes and other segmented, repeat
region-containing artificial chromosomes, as described herein.

1. Cells used in the generation of plant artificial chromosomes

Any cells harboring plant centromere-containing chromosomes may be
used in the generation of plant artificial chromosomes (PACs). Such cells
include, but are not limited to, plant cells, protoplasts, and cells that are
hybrid cells of one or more plant species. Preferred cells are those that
harbor plant centromere-containing chromosomes and are readily susceptible
to the introduction of heterologous nucleic acids therein.

Cells for use in the generation of plant artificial chromosomes include
cells that harbor acrocentric plant chromosomes. Examples of acrocentric
plant chromosomes include chromosomes 2 and 4 of the plant Arabidopsis
thaliana (see, e.g., Mayer et al. (1999) Nature 402:769-777; Murata et al.
(1997) The Plant Journal 12:31-37; The Arabidopsis Genome Initiative
(2000) Nature 408:796-815), four acrocentric chromosome pairs in
Helianthus annuus (sunflower; see Schrader et al. (1997) Chromosome Res.
5:451-456), two pairs of acrocentric chromosomes in domesticated pepper
plant (Capsicum annuum) and a nearly acrocentric chromosome in lentil
plant. In particular embodiments of the methods described herein, cells
harboring acrocentric plant chromosomes containing rDNA are used in
generating plant artificial chromosomes.

Plant species from which cells may be obtained include, but are not
limited to, vegetable crops, fruit and vine crops, field plants, bedding plants,
trees, shrubs, and other nursery stock. Examples of vegetable crops include
artichokes, kohlrabi, arugula, leeks, asparagus, lettuce, bok choy, malanga,
broccoli, melons (e.g., muskmelon, watermelon, crenshaw, honeydew,
cantaloupe), Brussels sprouts, cabbage, cardoni, carrots, napa, cauliflower,
okra, onions, celery, parsley, chick peas, parsnips, chicory, Chinese cabbage,
peppers, collards, potatoes, cucumber plants, pumpkins, cucurbits, radishes,
dry bulb onions, rutabaga, eggplant, salsify, escarole, shallots, endive, garlic,
spinach, green onions, squash, greens, beet, sweet potatoes, swiss chard,
horseradish, tomatoes, kale, turnips and spices. Fruit and vine crops include
apples, apricots, cherries, nectarines, peaches, pears, plums, prunes, quince,
almonds, chestnuts, filberts, pecans, pistachios, walnuts, citrus, blueberries,
boysenberries, cranberries, currants, loganberries, raspberries, strawberries,
blackberries, grapes, avocados, bananas, kiwi, persimmons, pomegranate,
pineapple, tropical fruits, pomes, melon, mango, papaya and lychee.

Field crop plants include evening primrose, meadow foam, corn,
maize, hops, jojoba, peanuts, rice, safflower, small grains (barley, oats, rye,
wheat, and others), sorghum, tobacco, kapok, leguminous plants (beans,
lentils, peas, soybeans), oil plants (canola, rape, mustard, poppy, olives,
sunflowers, coconut, castor oil plants, cocoa beans, groundnuts), fibre plants
(cotton, flax, hemp, jute), lauraceae (cinnamon, camphor) and plants such as
coffee, sugarcane, tea and natural rubber plants. Other examples of plants
include bedding plants such as flowers, cactus, succulents and ornamental
plants, as well as trees such as forest (broad-leaved trees and evergreens,
such as conifers), fruit, ornamental and nut-bearing trees, shrubs, algae,
moss, and duckweed.

2. Heterologous nucleic acids for use in generating plant artificial 
chromosomes 

a. Selectable markers 

The heterologous nucleic acid that is introduced into a cell in the
generation of artificial chromosomes as described herein may include nucleic
acid encoding a selectable marker. Any nucleic acid that includes a
selectable marker sequence may be introduced into cells harboring plant
centromere-containing chromosomes for the generation of plant artificial
chromosomes. Examples of selectable markers include, but are not limited
to, DNA encoding a product that confers resistance to a cytotoxic or
cytostatic agent and DNA encoding a readily detectable product, such as a
reporter protein.

(1) Nucleic acids encoding products that confer
resistance to a selection agent

Examples of selectable markers include the dihydrofolate reductase
(dhfr) gene, hygromycin phosphotransferase genes, the phosphinothricin
acetyl transferase gene (bar gene) and neomycin phosphotransferase genes.
Selectable markers that can be used in animal, e.g., mammalian, cells include,
but are not limited to, the thymidine kinase gene and the cellular
adenine-phosphoribosyltransferase gene.

Of particular interest for purposes herein are nucleic acid selectable
markers that, upon expression in the host cell, confer antibiotic or herbicide
resistance to the cell, sufficient to provide for the maintenance of
heterologous nucleic acids in the cell, and which facilitate the transfer of
artificial chromosomes containing the marker DNA into new host cells.
Examples of such markers include DNA encoding products that confer
cellular resistance to hygromycin, kanamycin, G418, bialaphos, Basta,
methotrexate, glyphosate, and puromycin. For example, neo (or nptII)
provides kanamycin resistance and can be selected for using kanamycin,
G418, paromomycin and other agents [see, e.g., Messing and Vierra (1982)
Gene 19:259-268; and Bevan et al. (1983) Nature 304:184-187]; bar from
Streptomyces hygroscopicus, which encodes the enzyme phosphinothricin
acetyl transferase (PAT), confers bialaphos, glufosinate, Basta or
phosphinothricin resistance [see, e.g., White et al. (1990) Nuc. Acids Res.
18:1062; Spencer et al. (1990) Theor. Appl. Genet. 79:625-631; Vickers et
al. (1996) Plant Mol. Biol. Reporter 14:363-368; and Thompson et al. (1987)
EMBO J. 6:2519-2523]; the hph gene confers resistance to the
antibiotic hygromycin (see, e.g., Blochinger and Diggelmann, Mol. Cell. Biol.
4:2929-2931); a mutant EPSP synthase protein [see Hinchee et al. (1988)
Bio/technol 6:915-922] confers glyphosate resistance (see also U.S. Patent
Nos. 4,940,935 and 5,188,642); and a nitrilase such as bxn from Klebsiella
ozaenae confers resistance to bromoxynil [see Stalker et al. (1988) Science
242:419-42]. DNA encoding cystathionine gamma-synthase (CGS) can be
used as a marker that confers resistance to ethionine (see PCT Application
Publication No. WO 00/55303). Examples of markers that can be used in
animal, e.g., mammalian, cells include, but are not limited to, DNA encoding
products that confer cellular resistance to streptomycin, zeocin,
chloramphenicol and tetracycline.

(2) Reporter Molecules 




Nucleic acids encoding reporter molecules may also be included in the
nucleic acid that is introduced into a recipient cell in the generation of
artificial chromosomes. Reporter genes provide a means for identifying cells
and chromosomes into which heterologous nucleic acids have been
transferred and further provide a means for assessing whether or not, and to
what extent, transferred DNA is expressed.

Nucleic acids encoding reporter molecules that may be used in
monitoring transfer and expression of heterologous nucleic acids into cells,
particularly plant cells, include, but are not limited to, nucleic acid encoding
β-glucuronidase (GUS) or the uidA gene product, which is an enzyme for which
various chromogenic substrates are known [see Novel and Novel (1973) Mol.
Gen. Genet. 120:319-335; Jefferson et al. (1986) Proc. Natl. Acad. Sci.
USA 83:8447-8451; US Patent No. 5,268,463; commercially available from
Clontech Laboratories, Palo Alto, CA], DNA from an R-locus gene, which
encodes a product that regulates the production of anthocyanin pigments
(red color) in plant tissues [see, e.g., Dellaporta et al. (1988) In
"Chromosome Structure and Function: Impact of New Concepts, 18th
Stadler Genetics Symposium" 11:263-282], nucleic acid encoding β-lactamase
[Sutcliffe (1978) Proc. Natl. Acad. Sci. U.S.A. 75:3737-3741], which is an
enzyme for which various chromogenic substrates are known (e.g., PADAC,
a chromogenic cephalosporin), DNA from a xylE gene [see, e.g., Zukowsky
et al. (1983) Proc. Natl. Acad. Sci. U.S.A. 80:1101-1105], which encodes a
catechol dioxygenase that can convert chromogenic catechols, nucleic acid
encoding α-amylase [see, e.g., Ikuta et al. (1990) Bio/technol. 8:241-242],
nucleic acid encoding tyrosinase [see, e.g., Katz et al. (1983) J. Gen.
Microbiol. 129:2703-2714], an enzyme capable of oxidizing tyrosine to
DOPA and dopaquinone, which in turn condenses to form the readily
detectable compound melanin, nucleic acid encoding β-galactosidase, an
enzyme for which there are chromogenic substrates, nucleic acid encoding
luciferase (the lux gene) [see, e.g., Ow et al. (1986) Science 234:856-859],
which allows for bioluminescence detection, nucleic acid encoding aequorin
[see, e.g., Prasher et al. (1985) Biochem. Biophy. Res. Commun.
126:1259-1268], which may be employed in calcium-sensitive bioluminescence
detection, nucleic acid encoding a green fluorescent protein (GFP) [see, e.g.,
Sheen et al. (1995) Plant J. 8:777-784; Haseloff et al. (1997) Proc. Natl.
Acad. Sci. U.S.A. 94:2122-2127; Haseloff and Amos (1995) Trends Genet.
11:328-329; Reichel et al. (1996) Proc. Natl. Acad. Sci. U.S.A.
93:5888-5893; Tian et al. (1997) Plant Cell Rep. 16:267-271; Prasher et al.
(1992) Gene 111:229-233; Chalfie et al. (1994) Science 263:802; PCT
Application Publication Nos. WO97/41228 and WO 95/07463; and commercially
available from Clontech Laboratories, Palo Alto, CA], nucleic acid encoding a
red or blue fluorescent protein (RFP or BFP, respectively), or nucleic acid
encoding chloramphenicol acetyltransferase (CAT).

Enhanced GFP (EGFP) is a mutant of GFP with a 35-fold increase in
fluorescence. This variant has mutations of Ser to Thr at amino acid 65 and
Phe to Leu at position 64 and is encoded by a gene with optimized human
codons (see, e.g., U.S. Patent No. 6,054,312). EGFP is a red-shifted variant
of wild-type GFP (Yang et al. (1996) Nucl. Acids Res. 24:4592-4593; Haas
et al. (1996) Curr. Biol. 6:315-324; Jackson et al. (1990) Trends Biochem.
15:477-483) that has been optimized for brighter fluorescence and higher
expression in mammalian cells (excitation maximum = 488 nm; emission
maximum = 507 nm). EGFP encodes the GFPmut1 variant (Jackson (1990)
Trends Biochem. 15:477-483) which contains the double-amino-acid
substitution of Phe-64 to Leu and Ser-65 to Thr. Sequences flanking EGFP
have been converted to a Kozak consensus translation initiation site (Huang
et al. (1990) Nucleic Acids Res. 18:937-947) to further increase the
translation efficiency in eukaryotic cells.
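The double substitution that distinguishes the variant described above (Phe-64 to Leu and Ser-65 to Thr) can be written in the usual one-letter shorthand as F64L and S65T. The following Python sketch shows how such substitutions might be applied to a protein sequence while checking the expected wild-type residue. The function name is hypothetical, and the toy sequence is an invented placeholder, not the actual GFP sequence; it merely places F and S at positions 64 and 65.

```python
import re

# One-letter substitution shorthand: wild-type residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(protein, substitutions):
    """Apply substitutions such as 'F64L' to a protein sequence.

    Raises ValueError if a substitution is malformed, out of range, or if
    the residue found at the position is not the stated wild-type residue.
    """
    residues = list(protein)
    for substitution in substitutions:
        match = _SUBSTITUTION.match(substitution)
        if match is None:
            raise ValueError(f"unrecognized substitution: {substitution!r}")
        wild_type = match.group(1)
        position = int(match.group(2))
        replacement = match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Placeholder sequence (NOT real GFP): F at position 64, S at position 65.
TOY_PROTEIN = "M" + "A" * 62 + "FSYG" + "A" * 3
```

Calling `apply_substitutions(TOY_PROTEIN, ["F64L", "S65T"])` returns a sequence of the same length with L and T at positions 64 and 65; the wild-type check guards against applying the shorthand to a sequence numbered differently.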

Nucleic acid from the maize R gene complex can also be used as
nucleic acid encoding a reporter molecule. The R gene complex in maize
encodes a protein that acts to regulate the production of anthocyanin
pigments in most seed and plant tissue. Maize strains can have one, or as
many as four, R alleles which combine to regulate pigmentation in a
developmental and tissue-specific manner. Thus, an R gene introduced into
such cells will cause the expression of a red pigment and, if stably
incorporated, can be visually scored as a red sector. If a maize line carries
dominant alleles for genes encoding the enzymatic intermediates in the
anthocyanin biosynthetic pathway (C2, A1, A2, Bz1 and Bz2), but carries a
recessive allele at the R locus, the transformation of any cell from that line
with R will result in red pigment formation. Exemplary lines include
Wisconsin 22, which contains the rg-Stadler allele, and TR112, a K55
derivative which is r-g, b, PI. Alternatively, any genotype of maize can be
utilized if the C1 and R alleles are introduced together.

b. Promoters and other sequences that influence gene 
expression 

Expression of nucleic acid encoding a selectable marker (or any
heterologous nucleic acid) in a recipient cell can be regulated by a variety of
promoters. Promoters for use in regulating transcription of DNA in cells,
particularly plant cells, include, but are not limited to, the nopaline synthase
(NOS) and octopine synthase (OCS) promoters, cauliflower mosaic virus
(CaMV) 19S and 35S promoters, the light-inducible promoter from the small
subunit of ribulose bis-phosphate carboxylase (ssRUBISCO, an abundant
plant polypeptide), the mannopine synthase (MAS) promoter [see, e.g.,
Velten et al. (1984) EMBO J. 3:2723-2730; and Velten and Schell (1985)
Nuc. Acids Res. 13:6981-6998], the rice actin promoter, the ubiquitin
promoter, for example, from Z. mays (see, e.g., PCT Application Publication
No. WO00/60061), the Arabidopsis thaliana UBI 3 promoter [see, e.g., Norris
et al. (1993) Plant Mol. Biol. 22:895-906] and the chemically inducible PR-1
promoter from tobacco or Arabidopsis (see, e.g., U.S. Patent No. 5,689,044).

Selection of a suitable promoter may include several considerations,
for example, recipient cell type (such as, for example, leaf epidermal cells,
mesophyll cells, root cortex cells), tissue- or organ-specific (e.g., roots,
leaves or flowers) expression of genes linked to the promoter, and timing and
level of expression (as may be influenced by constitutive vs. regulatable
promoters and promoter strength).

Additional sequences that may also be included in the nucleic acid
containing a selectable marker include, but are not restricted to, transcription
terminators and extraneous sequences to enhance expression such as
introns. A variety of transcription terminators may be used which are
responsible for termination of transcription beyond a coding region and
correct polyadenylation. Appropriate transcription terminators include those
that are known to function in plants such as, for example, the CaMV 35S
terminator, the tml terminator, the nopaline synthase terminator and the pea
rbcS E9 terminator, all of which may be used in both monocotyledonous and
dicotyledonous plants.

Numerous sequences have been found to enhance gene expression
from within the transcriptional unit and these sequences can be used in
conjunction with selectable marker and other genes to increase expression of
the genes in plant cells. For example, various intron sequences such as
introns of the maize Adh1 gene have been shown to enhance expression,
particularly in monocotyledonous cells. In addition, a number of non-translated
leader sequences derived from viruses are also known to enhance
expression, and these are particularly effective in dicotyledonous cells.

c. Nucleic acids containing targeting sequences 
Development of a multicentric, particularly dicentric, chromosome
typically is effected through integration of heterologous nucleic acid into
heterochromatin, such as the pericentric heterochromatin, near or within the
centromeric regions of chromosomes and/or into rDNA sequences. Thus, the
development of artificial chromosomes may be facilitated by targeting the
heterologous nucleic acid for integration into these regions, such as by
introducing DNA, including, but not limited to, rDNA (e.g., rDNA intergenic
spacer sequence), satellite DNA, pericentric DNA and lambda phage DNA,
into the recipient host cell. The targeting sequence may be introduced alone
or with other nucleic acids, including but not limited to selectable markers.
For example, a targeting sequence can be linked to a selectable marker.

Examples of plant pericentric DNA and satellite DNA include, but are
not limited to, pericentromeric sequences on tomato chromosome 6 [see,
e.g., Weide et al. (1998) Mol. Gen. Genet. 259:190-197], satellite DNA of
soybean [see, e.g., Morgante et al. (1997) Chromosome Res. 5:363-373;
and Vahedian et al. (1995) Plant Mol. Biol. 29:857-862], pericentromeric
DNA of Arabidopsis thaliana [see, e.g., Tutois et al. (1999) Chromosome
Res. 7:143-156], satellite DNA of Arabidopsis thaliana (GenBank accession
nos. AB033593 and X58104), pericentric DNA of the chickpea [Cicer
arietinum L.; see, e.g., Staginnus et al. (1999) Plant Mol. Biol. 39:1037-
1050], satellite DNA on the rye B chromosome [see, e.g., Langdon et al.
(2000) Genetics 154:869-884], subtelomeric satellite DNA from Silene
latifolia [see, e.g., Garrido-Ramos et al. (1999) Genome 42:442-446] and
satellite DNA in the Saccharum complex [see, e.g., Alix et al. (1998)
Genome 41:854-864].

Examples of rDNA targeting sequences include nucleic acids from
plant and animal rDNA. Plant rDNA sequences include, but are not limited
to, sequences contained in GENBANK Accession numbers D16103 [from
rDNA of carrot (Daucus carota)], M23642 and M11585 [from rDNA encoding
24S rRNA of rice (Oryza sativa)], M26461 [from rDNA encoding 18S
rRNA of rice (Oryza sativa)], M16845 [from rDNA encoding 17S, 5.8S and
25S rRNA of rice (Oryza sativa)], X82780 and X82781 [from rDNA encoding
5S rRNA of potato (Solanum tuberosum)], AJ131161, AJ131162,
AJ131163, AJ131164, AJ131165, AJ131166 and AJ131167 [from rDNA
encoding 5S rRNA of tobacco (Nicotiana tabacum)], L36494 and U31016
through U31030 [from rDNA encoding 5S rRNA of barley (Hordeum
spontaneum)], U31004 through U31015 and U31031 [from rDNA encoding
5S rRNA of barley (Hordeum bulbosum)], Z11759 [from rDNA encoding 5.8S
rRNA of barley (Hordeum vulgare)], X16077 (from rDNA encoding 18S rRNA
of Arabidopsis thaliana), M65137 (rDNA encoding 5S rRNA of Arabidopsis
thaliana), AJ232900 (from rDNA encoding 5.8S rRNA of Arabidopsis
thaliana) and X52320 (from Arabidopsis thaliana genes for 5.8S and 25S
rRNA with an 18S rRNA fragment).

Intergenic spacer regions of plant rDNA include, but are not limited to,
sequences contained in GENBANK Accession numbers S70723 (from the 5S
rDNA of barley (Hordeum vulgare)), AF013103 and X03989 (from maize
(Zea mays)), X65489 (from potato (Solanum tuberosum)), X52265 (from
tomato (Lycopersicon esculentum)), AF177418 (from Arabidopsis neglecta),
AF177421 and AF17422 (from Arabidopsis halleri), A71562, X15550,
X52631, U43224, X52320, X52636 and X52637 (from Arabidopsis
thaliana; see Gruendler et al. (1991) J. Mol. Biol. 221:1209-1222 and
Gruendler et al. (1989) Nucleic Acids Res. 17:6395-6396), X54194 [from
rice (Oryza sativa)], Y08422 and D76443 [from tobacco (Nicotiana
tabacum)], AJ243073 [from wheat (Triticum boeoticum)] and X07841 [from
wheat (Triticum aestivum)]. Sequences of intergenic spacer regions of plant
rDNA further include sequences from rye [see Appels et al. (1986) Can. J.
Genet. Cytol. 28:673-685], wheat [see Barker et al. (1988) J. Mol. Biol.
201:1-17 and Sardana and Flavell (1996) Genome 39:288-292], radish [see
Delcasso-Tremousaygue et al. (1988) Eur. J. Biochem. 172:767-776], Vicia
faba and Pisum sativum [see Kato et al. (1990) Plant Mol. Biol. 14:983-993],
mung bean [see Gerstner et al. (1988) Genome 30:723-733; and Schiebel et
al. (1989) Mol. Gen. Genet. 218:302-307], tomato [see Schmidt-Puchta et
al. (1989) Plant Mol. Biol. 13:251-253], Hordeum bulbosum [see Procunier et
al. (1990) Plant Mol. Biol. 15:661-663], Lens culinaris Medik., and other
legume species [see Fernandez et al. (2000) Genome 43:597-603] and
tobacco [see U.S. Patent Nos. 6,100,092 and 6,096,546 and PCT
Application Publication No. WO99/66058; Borysyuk et al. (1997) Plant Mol.
Biol. 35:655-660; Borysyuk et al. (2000) Nature Biotechnology 18:1303-
1306].

Mammalian rDNA sequences include, but are not limited to, DNA of
GENBANK accession no. X82564 and portions thereof, the DNA of
GENBANK accession no. U13369 and portions thereof and DNA sequences
provided in PCT Application Publication No. WO97/40183 (particularly SEQ
ID NOS. 18-24 of WO97/40183). A particular vector for use in directing
integration of heterologous nucleic acid into chromosomal rDNA is pTERPUD
(see PCT Application Publication No. WO97/40183). Satellite DNA
sequences can also be used to direct the heterologous DNA to integrate into
the pericentric heterochromatin. For example, vectors pTEMPUD and
pHASPUD, which contain mouse and human satellite DNA, respectively (see
PCT Application Publication No. WO97/40183), are examples of vectors that
may be used for introduction of heterologous nucleic acid into cells for de
novo chromosome formation leading to artificial chromosomes.

3. Methods for introduction of heterologous nucleic acids into host 
cells 

Any methods known in the art for introducing heterologous nucleic
acids into host cells may be used in the methods of preparing artificial
chromosomes. The particular method used may depend on the type of cell
into which the heterologous nucleic acid is being transferred. For example,
methods for the physical introduction of nucleic acids into plant cells, for
example, protoplasts and plant cells in culture, include, but are not limited to,
polyethylene glycol (PEG)-mediated DNA uptake, electroporation,
lipid-mediated delivery, including liposomes, calcium phosphate-mediated DNA
uptake, microinjection, particle bombardment, silicon carbide
whisker-mediated transformation and combinations of these methods, for example
methods utilizing combinations of calcium phosphate and PEG for DNA
uptake or methods utilizing a combination of electroporation, PEG and heat
shock (see, e.g., U.S. Patent Nos. 5,231,019 and 5,453,367). Physical
methods such as these are known in the art and are effective in introducing
DNA into a variety of dicotyledonous and monocotyledonous plants [see,
e.g., Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985)
Mol. Gen. Genet. 199:169-177; Reich et al. (1986) Biotechnology 4:1001-
1004; Klein et al. (1987) Nature 327:70-73; U.S. Patent No. 6,143,949;
Paszkowski et al. (1989) in Cell Culture and Somatic Cell Genetics of Plants,
Vol. 6, Molecular Biology of Plant Nuclear Genes, eds. Schell, J. and Vasil,
L.K., Academic Publishers, San Diego, California, p. 52-68; and Frame et al.
(1994) Plant J. 6:941-948].

In addition to these methods for the introduction of nucleic acids into
plant cells based on physically, mechanically or chemically mediated
processes, it is possible to introduce nucleic acids into plant cells by
biological methods, such as those utilizing Agrobacterium. In this method,
nucleic acid sequences located adjacent to T-DNA border repeats can be
inserted into the genome of a plant cell, typically dicotyledonous plant cells,
by utilizing the encoded function for DNA transfer found in the genus
Agrobacterium. This method has also been shown to work for some
monocotyledonous plant cells, such as rice cells.

Any method for introducing nucleic acids into plant cells can be used
in the generation of artificial chromosomes, provided the method is capable
of introducing the nucleic acid into an amplifiable region of a chromosome,
for example, heterochromatin, and particularly in close proximity to a
megareplicator region of a plant chromosome.

a. Agrobacterium-mediated introduction of nucleic acids
into plant cells

Agrobacterium-mediated transformation is particularly well-suited for
transformation of dicotyledons because of its high efficiency of
transformation and its broad utility with many different species, including
tobacco, tomato (see, e.g., European Patent Application no. 0 249 432),
sunflower, cotton (see, e.g., European Patent Application no. 0 317 511),
oilseed rape, potato, soybean, alfalfa and poplar (see, e.g., U.S. Patent No.
4,795,855) (see also PCT Application Publication no. WO87/07299 with
respect to transformation of Brassica). Agrobacterium-mediated
transformation has also been used to transfer nucleic acids into
monocotyledonous plants. Agrobacterium-mediated transformation of
Chlorophytum capense and Narcissus cv "Paperwhite" [see, e.g., Hooykaas-
Van Slogteren et al. (1984) Nature 311:763-764], corn and wheat [see, e.g.,
U.S. Patent Nos. 5,164,310, 5,187,073 and 5,177,010 and Mooney et al.
(1991) Plant Cell, Tissue, Organ Culture 25:209-218], rice [see, e.g., Raineri
et al. (1990) Bio/Technology 8:33-38 and Chan et al. (1993) Plant Mol. Biol.
22:491-506] and barley [see, e.g., Tingay et al. (1997) The Plant J.
369-1 376 and Qureshi et al. (1998) Proc. 42nd Conference of
Australian Society for Biochemistry and Molecular Biology, September 28-
October 1, 1998, Adelaide, Australia] has been reported.

Agrobacterium-mediated delivery of nucleic acids is based on the
capacity of certain Agrobacterium strains to introduce a part of their Ti
(tumor-inducing) plasmid, i.e., the transforming DNA or T-DNA, into plant
cells and to integrate this T-DNA into the genome of the cells. The part of
the Ti plasmid that is transferred and integrated is delineated by specific DNA
sequences, the left and right T-DNA border sequences. The natural T-DNA
sequences between these border sequences can be replaced by foreign DNA
[see, e.g., European Patent Publication 116 718 and Deblaere et al. (1987)
Meth. Enzymol. 153:277-293].

When Agrobacterium is used for transformation, the heterologous
nucleic acid being transferred typically is cloned into a plasmid that contains
T-DNA border regions and is replicated independently of the Ti plasmid
(referred to as the binary vector system) or the heterologous nucleic acid is
inserted between the T-DNA borders of the Ti plasmid (referred to as the
co-integrate method). In co-integrate methods, intermediate vectors carrying
the heterologous nucleic acid are integrated into the Ti or Ri plasmid by
homologous recombination owing to sequences that are homologous to
sequences within the T-DNA region of the Ti or Ri plasmid. The Ti or Ri
plasmid also contains the vir region necessary for transfer of the T-DNA.

Intermediate vectors cannot replicate in Agrobacteria. The
intermediate vector can be transferred into Agrobacterium by means of a
helper plasmid [conjugation; see Fraley et al. (1983) Proc. Natl. Acad. Sci.
USA 80:4803]. This method, typically referred to as triparental mating,
introduces the heterologous nucleic acid sequence into the bacterium and
allows for selection of a homologous recombination event that produces the
desired Agrobacterium genotype. The triparental mating procedure typically
employs Escherichia coli carrying the recombinant intermediate vector and a
helper E. coli strain which carries a plasmid that is able to mobilize the
recombinant intermediate vector to the target Agrobacterium strain. A
modified Ti or Ri plasmid, which contains a heterologous nucleic acid
sequence located within the T-DNA region, is obtained from the transfer and
selection process. The resultant Agrobacterium strain is capable of
transferring the heterologous nucleic acid to plant cells.

Binary vectors can replicate both in E. coli and Agrobacterium. They
typically contain a selection marker gene and a linker or polylinker which are
flanked by the right and left T-DNA border regions and can be transformed
directly into Agrobacterium [see, e.g., Hofgen and Willmitzer (1988) Nuc.
Acids. Res. 16:9877 and Holsters et al. (1978) Mol. Gen. Genet. 163:181-
187] or introduced through triparental mating. The Agrobacterium host cell
contains a plasmid carrying a vir region needed for transfer of the T-DNA into
a plant cell [see, e.g., White in Plant Biotechnology, eds. Kung, S. and
Arntzen, C.J., Butterworth Publishers, Boston, Mass., (1989) p. 3-34 and
Fraley in Plant Biotechnology, eds. Kung, S. and Arntzen, C.J., Butterworth
Publishers, Boston, Mass., (1989) p. 395-407].

Agrobacterium-mediated transformation typically involves the transfer
of a binary vector carrying the heterologous nucleic acid of interest to an
appropriate Agrobacterium strain, the choice of which may depend on the
complement of vir genes carried by the host Agrobacterium strain either on a
co-resident Ti plasmid or chromosomally (see, e.g., Uknes et al. (1993) Plant
Cell 5:159-169). The transfer of a recombinant binary vector to
Agrobacterium is accomplished by a triparental mating procedure using
Escherichia coli carrying the recombinant binary vector and a helper E. coli
strain which carries a plasmid that is able to mobilize the recombinant binary
vector to the target Agrobacterium strain. Alternatively, the recombinant
binary vector can be transferred to Agrobacterium by DNA transformation
(see, e.g., Hofgen & Willmitzer (1988) Nuc. Acids. Res. 16:9877).

Many vectors are available for transfer of nucleic acids into
Agrobacterium tumefaciens [see, e.g., Rogers et al. (1987) Methods in
Enzymol. 153:253-277]. These typically carry at least one T-DNA border
sequence and include vectors such as pBIN19 [see, e.g., Bevan (1984) Nuc.
Acids. Res. 12:8711-8721]. Typical vectors suitable for Agrobacterium
transformation include the binary vectors pCIB200 and pCIB2001, as well as
the binary vector pCIB10 and hygromycin selection derivatives thereof (see,
e.g., U.S. Patent No. 5,639,949). Other vectors that can be employed are
the pCambia vectors (see www.cambia.org), including, for example,
pCambia 3300 and pCambia 1302 (GenBank Accession No. AF234298).

A particularly useful Ti plasmid cassette vector for the transformation
of dicotyledonous plants contains the enhanced CaMV35S promoter (EN35S)
and the 3' end, including polyadenylation signals, of a soybean gene
encoding the α subunit of β-conglycinin. Between these two elements is a
multilinker containing multiple restriction sites for the insertion of genes of
interest (see, e.g., U.S. Patent No. 6,023,013). The vector can contain a
segment of pBR322 which provides an origin of replication in E. coli and a
region for homologous recombination with the disarmed T-DNA in
Agrobacterium strain ACO; the oriV region from the broad host range
plasmid RK1; the streptomycin/spectinomycin resistance gene from Tn7; and
a chimeric NPTII gene, containing the CaMV35S promoter and the nopaline
synthase (NOS) 3' end, which provides kanamycin resistance in transformed
plant cells. Optionally, the enhanced CaMV35S promoter may be replaced
with the 1.5 kb mannopine synthase (MAS) promoter (see, e.g., Velten et al.
(1984) EMBO J. 3:2723-2730). After incorporation of a DNA construct into
the vector, it is introduced into A. tumefaciens strain ACO which contains a
disarmed Ti plasmid. Cointegrate Ti plasmid vectors are selected and
subsequently may be used to transform a dicotyledonous plant.

Transformation of the target plant species by recombinant
Agrobacterium usually involves co-cultivation of the Agrobacterium with
explants from the plant and follows published protocols. Methods of
inoculation of the plant tissue vary depending upon the plant species and the
Agrobacterium delivery system. The plant tissue can be either protoplast,
callus or organ tissue, depending on the plant species. A widely used
approach is the leaf disc procedure which can be performed with any tissue
explant that provides a good source for initiation of whole plant
differentiation (see, e.g., Horsch et al. in Plant Molecular Biology Manual A5,
Kluwer Academic Publishers, Dordrecht (1988) p. 1-9 and U.S. Patent No.
6,136,320). The addition of nurse tissue may be desirable under certain
conditions. There are multiple choices of Agrobacterium strains (including,
but not limited to, A. tumefaciens and A. rhizogenes) and plasmid
construction strategies that can be used to optimize genetic transformation
of plants. Transformed tissue carrying an antibiotic or herbicide resistance
marker present between the T-DNA borders of the binary plasmid can be
regenerated on selectable medium.

A. tumefaciens ACO is a disarmed strain similar to pTiB6SE [see
Fraley et al. (1985) Bio/Technology 3:629-635]. For construction of ACO,
the starting Agrobacterium strain was A208 which contains a nopaline-type
Ti plasmid. The Ti plasmid was disarmed in a manner similar to that
described by Fraley et al. [(1985) Bio/Technology 3:629-635] so that
essentially all of the native T-DNA was removed except for the left border
and a few hundred base pairs of T-DNA inside the left border. The remainder
of the T-DNA extending to a point just beyond the right border was replaced
with a piece of DNA including (from left to right) a segment of pBR322, the
oriV region from plasmid RK2, and the kanamycin resistance gene from
Tn601. The pBR322 and oriV segments are similar to the corresponding
segments of the cassette vector described above and provide a region of
homology for cointegrate formation (see U.S. Patent No. 6,023,013).
Another useful strain of Agrobacterium is A. tumefaciens strain
GV3101/pMP90 [see, e.g., Koncz and Schell (1986) Mol. Gen. Genet.
204:383-396].

Advances in Agrobacterium-mediated transfer allow introduction of
larger segments of nucleic acids [see, e.g., Hamilton (1997) Gene 4:200(1-
2):107-116; Hamilton et al. (1996) Proc. Natl. Acad. Sci. U.S.A. 93:9975-
9979; Liu et al. (1999) Proc. Natl. Acad. Sci. U.S.A. 96:6535-6540]. The
vectors used in these methods are designed to have the characteristics of
both bacterial artificial chromosomes (BACs) and binary vectors for
Agrobacterium-mediated transformation. Therefore, somewhat larger DNA
fragments cloned in the T-DNA region can be transferred into a plant genome
by Agrobacterium. Binary bacterial artificial chromosome (BIBAC) vector
BIBAC2 (see U.S. Patent No. 5,733,744; available from the Plant Science
Center, Cornell University) and the transformation-competent bacterial
artificial chromosome (TAC) vector pYLTAC7 (available from the Plant Cell
Bank of the RIKEN Gene Bank, Tsukuba, Japan) are examples of the types of
vectors that may be used in transferring larger segments of nucleic acids,
particularly heterologous nucleic acids containing targeting and/or selectable
marker sequences as described herein, into plants via Agrobacterium-
mediated DNA transfer processes.

Introduction of heterologous nucleic acids into plant cells without the
use of Agrobacterium circumvents the requirements for T-DNA sequences in
the transformation vector and consequently vectors lacking these sequences
can be utilized in addition to vectors containing T-DNA sequences.
Techniques for nucleic acid transfer that do not rely on Agrobacterium
include transformation via particle bombardment, direct DNA uptake (e.g.,
PEG, lipids, electroporation) and mechanical methods such as microinjection
or silicon "whiskers". The choice of vector that may be used in introduction
of heterologous nucleic acids into plant cells can depend largely on the
preferred selection for the species being transformed. Typical vectors
suitable for transformation without Agrobacterium include pCIB3064,
pSOG19 and pSOG35 (see, e.g., U.S. Patent No. 5,639,949), or common
plasmid, phage or cosmid vectors.

b. Direct DNA Uptake 
Introduction of heterologous nucleic acids into plant cells may be
achieved using a variety of methods that facilitate direct DNA uptake,
including calcium phosphate precipitation, polyethylene glycol (PEG)
treatment, electroporation, and combinations thereof [see, e.g., Potrykus et
al. (1985) Mol. Gen. Genet. 199:183; Lorz et al. (1985) Mol. Gen. Genet.
199:178; Fromm et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828;
Uchimiya et al. (1986) Mol. Gen. Genet. 204:204-; Callis et al. (1987) Genes
Dev. 1:1183-2000; Callis et al. (1987) Nuc. Acids Res. 15:5823-5831;
Marcotte et al. (1988) Nature 335:454; Toriyama et al. (1988)
Bio/Technology 6:1072-1074; Haim et al. (1985) Mol. Gen. Genet. 199:161-
168; Deshayes et al. (1985) EMBO J. 4:2731-2737; Krens et al. (1982)
Nature 296:72-74; Crossway et al. (1986) Mol. Gen. Genet. 20:179].

Typically, plant protoplasts are used for direct DNA uptake, or in some
instances plant tissue that has been treated to remove a portion or the
majority of the cell wall (see, e.g., PCT Publication No. WO93/21335 and
U.S. Patent No. 5,472,869). Removal of the cell wall is believed to facilitate
entry of DNA into plant cells, although in some instances electroporation may
be used to introduce DNA into specialized plant cells, e.g., electroporation of
pollen, without first removing the cell wall.

Techniques for the preparation of callus and protoplasts from maize,
transformation of protoplasts using PEG or electroporation, and the
regeneration of maize plants from transformed protoplasts are found, for
example, in European Patent Application nos. 0 292 435 and 0 392 225 and
PCT Application Publication no. WO93/07278. Transformation of rice can
also be undertaken by direct gene transfer techniques utilizing protoplasts
[see, e.g., Zhang et al. (1988) Plant Cell Rep. 7:379-384; Shimamoto et al.
(1989) Nature 338:274-277; Datta et al. (1990) Biotechnology 8:736-740].
The regeneration of fertile transgenic barley by direct DNA transfer to
protoplasts is described, for example, by Funatsuki et al. [(1995) Theor.
Appl. Genet. 91:707-712]. Other plant species, including tobacco and
Arabidopsis, may also serve as sources of protoplasts for use in introduction
of heterologous nucleic acids into plant cells.

c. Particle bombardment-mediated introduction of nucleic
acids into plant cells

Microprojectile bombardment of plant cells can be an effective method
for the introduction of nucleic acids into plant cells. In these methods,
nucleic acids are carried through the cell wall and into the cytoplasm on the
surface of small, typically metal, particles [see, e.g., Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Natl. Acad. Sci. U.S.A. 85:8502-
8505; Klein et al. in Progress in Plant Cellular and Molecular Biology, eds.
Nijkamp, H.J.J., Van der Plas, J.H.W., and Van Aartrijk, J., Kluwer
Academic Publishers, Dordrecht, (1988), p. 56-66; Seki et al. (1999) Mol.
Biotechnol. 11:251-255; and McCabe et al. (1988) Bio/Technology 6:923-
926]. Particles may be coated with nucleic acids and delivered into cells by
a propelling force. Exemplary particles include those containing tungsten,
gold or platinum, as well as magnesium sulfate crystals. The metal
particles can penetrate through several layers of cells and thus allow the
transformation of cells within tissue explants.

In an illustrative embodiment [see, e.g., U.S. Patent No. 6,023,013] of
a method for delivering nucleic acids into plant cells, e.g., maize cells, by
acceleration, a Biolistics Particle Delivery System may be used to propel
particles coated with DNA or cells through a screen, such as a stainless steel
or Nytex screen, onto a filter surface covered with plant (e.g., corn) cells
cultured in suspension. The screen disperses the particles so that they are
not delivered to the recipient cells in large aggregates. The intervening
screen between the projectile apparatus and the cells to be bombarded may
reduce the size of projectile aggregates and may contribute to a higher
frequency of transformation by reducing damage inflicted on the recipient
cells by projectiles that are too large.

For the bombardment, cells in suspension may be concentrated on
filters or solid culture medium. Alternatively, immature embryos or other
target cells may be arranged on solid culture medium. The cells to be
bombarded are typically positioned at an appropriate distance below the
macroprojectile stopping plate. If desired, one or more screens may also be
positioned between the acceleration device and the cells to be bombarded.

The prebombardment culturing conditions and bombardment
parameters may be optimized to yield the maximum numbers of stable
transformants. Both the physical and biological parameters for bombardment
can be important in this technology. Physical factors include those that
involve manipulating the DNA/microprojectile precipitate or those that affect
the flight and velocity of either the macro- or microprojectiles. Biological
factors include all steps involved in manipulation of cells before and
immediately after bombardment, the osmotic adjustment of target cells to
help alleviate the trauma associated with bombardment, and also the nature
of the transforming nucleic acid, such as linearized DNA or intact supercoiled
plasmids.

Physical parameters that may be adjusted include gap distance, flight
distance, tissue distance and helium pressure. In addition, transformation
may be optimized by adjusting the osmotic state, tissue hydration and
subculture stage or cell cycle of the recipient cells.
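
One way to organize the adjustment of the physical parameters named above is a simple factorial grid in which every combination of candidate settings is tested. The following is a minimal sketch in Python under stated assumptions: the numeric values, the units, and the names PHYSICAL_GRID and enumerate_conditions are hypothetical placeholders chosen for illustration and are not taken from this disclosure or from any instrument manual.

```python
from itertools import product

# Hypothetical candidate settings; the values and units are illustrative
# placeholders, not parameters taken from the text.
PHYSICAL_GRID = {
    "gap_distance_mm": [3, 6, 9],
    "flight_distance_mm": [6, 10],
    "tissue_distance_cm": [6, 9, 12],
    "helium_pressure_psi": [650, 1100, 1550],
}


def enumerate_conditions(grid):
    """Return one dict per combination of the supplied parameter values."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


conditions = enumerate_conditions(PHYSICAL_GRID)
print(len(conditions))  # 3 * 2 * 3 * 3 = 54 combinations
```

Each resulting dictionary describes one bombardment condition, for which the number of stable transformants could then be recorded. A full factorial grid grows quickly, so in practice a subset of combinations may be preferred.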

Techniques for transformation of A188-derived maize line using
particle bombardment are described in Gordon-Kamm et al. [(1990) Plant Cell
2:603-618] and Fromm et al. [(1990) Biotechnology 8:833-839].
Transformation of rice may also be accomplished via particle bombardment
[see, e.g., Christou et al. (1991) Biotechnology 9:957-962]. Particle
bombardment may also be used to transform wheat [see, e.g., Vasil et al.
(1992) Biotechnology 10:667-674 for transformation of cells of type C
long-term regenerable callus; and Weeks et al. (1993) Plant Physiol. 102:1077-
1084 for transformation of wheat using particle bombardment of immature
embryos and immature embryo-derived callus]. The production of transgenic
barley using bombardment methods is described, for example, by Koprek et
al. [(1996) Plant Sci. 119:79-91].

d. Electroporation-mediated introduction of nucleic acids
into plant cells

The application of brief, high-voltage electric pulses to a variety of
animal and plant cells leads to the formation of nanometer-sized pores in the
plasma membrane. Nucleic acids are taken directly into the cell cytoplasm
either through these pores or as a consequence of the redistribution of
membrane components that accompanies closure of the pores.
Electroporation can be extremely efficient and can be used both for transient
expression of cloned genes and for the establishment of cell lines that carry
integrated copies of the gene of interest.

Certain cell wall-degrading enzymes, such as pectin-degrading
enzymes, may be employed to render the target recipient cells more
susceptible to transformation by electroporation than untreated cells.
Alternatively, recipient cells may be made more susceptible to transformation
by mechanical wounding. To effect transformation by electroporation, friable
tissues such as a suspension culture of cells or embryonic callus may be
used or immature embryos or other organized tissues may be directly
transformed [see, e.g., Fromm et al. (1986) Nature 319:791-793; and
Neuman et al. (1982) EMBO J. 1:841-845].

e. Microinjection-mediated introduction of nucleic acids into
plant cells

In microinjection techniques, nucleic acids are mechanically injected
directly into cells using very small micropipettes. For example, microinjection
of protoplast cells with foreign DNA for transformation of plant cells has
been reported for barley and tobacco [see, e.g., Holm et al. (2000)
Transgenic Res. 9:21-32 and Schnorf et al. Transgenic Res. 1:23-30].

f. Lipid-mediated introduction of nucleic acids into plant
cells

In lipid-mediated transfer, nucleic acids are contacted with lipids
and/or encapsulated in lipid-containing structures, including but not limited to
liposomes, and the nucleic acid-containing liposomes are fused with plant
protoplasts. The fusion can occur in the presence or absence of a fusogen,
such as PEG. Lipid-mediated transformation of plant protoplasts has been
reported [see, e.g., Fraley and Papahadjopoulos (1982) Curr. Top. Microbiol.
Immunol. 96:171-191; Deshayes et al. (1985) EMBO J. 4:2731-2737 and
Spoerlein and Koop (1991) Theor. Appl. Genetics 83:1-5].

g. Other methods of introduction of nucleic acids into plant 
cells 

Other methods to physically introduce nucleic acid into plant cells may
be used, including silicon carbide fibers ("whiskers") that are used to pierce
plant cell walls thereby facilitating nucleic acid uptake, the use of sound
waves to introduce holes in plant cell membranes to facilitate nucleic acid
uptake (e.g., sonoporation) and the use of laser beams to open holes in cell
membranes facilitating the entry of nucleic acids (e.g., laser poration).

Nucleic acids may also be imbibed by hydrating plant tissue, providing
another method for nucleic acid uptake into plant cells [see, e.g., Simon
(1974) New Phytologist 73:377-420]. For example, nucleic acids may be
taken into cereal and legume seed embryos by imbibition [see, e.g., Toepfer
et al. (1989) The Plant Cell 1:133-139].

4. Treatment of cells into which heterologous nucleic acids have 
been introduced 

Cells into which heterologous nucleic acids have been introduced may
be analyzed for de novo formation of artificial chromosomes described herein
such as may result from amplification of chromosomal segments occurring in
connection with integration of heterologous nucleic acids into chromosomes.
Typically, amplification occurs over multiple generations of cell division
leading to the formation of detectable changes in chromosome structure.
Therefore, transfected cells are typically cultured through multiple cell
divisions, from about 5 to about 60, or about 5 to about 55, or about 10 to
about 55, or about 25 to about 55, or about 35 to about 55 cell divisions
following introduction of nucleic acid into a cell. Artificial chromosomes
may, however, appear after only about 5 to about 15 or about 10 to about
15 cell divisions. Cells into which heterologous nucleic acids have been
introduced may be treated in a variety of ways prior to or during analysis
thereof for the presence of artificial chromosomes.

For example, cells into which nucleic acid encoding a selectable
marker required for growth in the presence of a selection agent has been
transferred can be treated in the same manner as the cells exemplified herein
to facilitate generation of multicentric chromosomes, and fragmentation
thereof, and/or the generation of artificial chromosomes. The cells may be
grown in the presence of an appropriate concentration of selection agent,
which may be determined empirically by growing untransfected cells in varying
concentrations of the agent and identifying concentrations sufficient to
prevent cell growth and/or facilitate amplification of chromosomal segments.
Transfected cells may be grown in selective media for numerous generations
and cell lines can be established that contain the introduced nucleic acid.
The concentration of selection agent may also be increased over several
generations to promote amplification of a region of a chromosome into which
heterologous nucleic acid has integrated. Transfected cells may also be treated
to destabilize the chromosomes to facilitate generation and fragmentation of
a multicentric, typically dicentric, chromosome.

Additional heterologous nucleic acid, e.g., nucleic acid encoding a
selectable marker, may also be introduced into the transfected cells to
facilitate amplification of chromosomal segments, such as the pericentric
heterochromatin, contained in, for example, a fragment released from a
multicentric chromosome (e.g., a formerly dicentric chromosome), and
generation of a heterochromatic artificial chromosome. The resulting
transformed cells can then be grown in the presence of a selection agent,
which may be a second agent (if the heterologous nucleic acid introduced
into the transfected cells encodes a selectable marker different from any
selectable marker encoded by heterologous nucleic acid initially transferred
into the original host cells), with or without the first selection agent.
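
The empirical determination of a selection-agent concentration described above can be illustrated with a short sketch. This is a hypothetical illustration only: the function name, the 5% growth cutoff, the units and the sample readings are assumptions made for the example and are not taken from this specification.

```python
from typing import Mapping, Optional


def minimal_inhibitory_concentration(
    growth_by_concentration: Mapping[float, float],
    max_relative_growth: float = 0.05,  # assumed cutoff, not from the source
) -> Optional[float]:
    """Return the lowest tested concentration that prevents growth.

    growth_by_concentration maps a selection-agent concentration (any
    consistent unit) to a growth measure for untransfected cells.
    The entry at concentration 0.0 is the no-agent control.
    """
    if 0.0 not in growth_by_concentration:
        raise ValueError("a no-agent control (concentration 0.0) is required")
    control = growth_by_concentration[0.0]
    if control <= 0:
        raise ValueError("control growth must be positive")
    # Walk the tested concentrations from lowest to highest.
    for conc in sorted(c for c in growth_by_concentration if c > 0):
        if growth_by_concentration[conc] / control <= max_relative_growth:
            return conc
    return None  # no tested concentration was sufficient


# Hypothetical readings: concentration (mg/L) -> fresh-weight gain (mg)
readings = {0.0: 400.0, 10.0: 310.0, 25.0: 120.0, 50.0: 12.0, 100.0: 0.0}
selected = minimal_inhibitory_concentration(readings)
```

With these invented readings the sketch selects the 50 mg/L condition, the lowest concentration at which untransfected cells reach no more than 5% of control growth. A stepwise increase of the agent over several generations could start from such a value.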

Cells into which nucleic acids have been introduced may also be
subjected to cell sorting. For example, protoplasts may be prepared from
transfected plant cells or calli and subjected to sorting. If the sorting is
conducted prior to chromosomal analysis of the cells for the presence of
artificial chromosomes, it provides a population of transfected cells that may
be enriched for artificial chromosomes and thus facilitates the subsequent
chromosomal analysis of the cells.

The sorting is based on the presence of a detectable marker in the
cells, as provided for by the introduced nucleic acid, which can provide the
basis for isolating such cells from cells that do not contain the heterologous
nucleic acid. For example, the nucleic acid introduced into the plant cells
may contain nucleic acid encoding a fluorescent protein, such as a green, red
or blue fluorescent protein, which may be used for selection, by flow
cytometry and other methods, of recipient cells that have taken up and
express the nucleic acid at readily detected levels.

In an exemplary protocol, GFP fluorescence of transfected cell cultures
may be monitored visually during culture using an inverted microscope
equipped with epifluorescence illumination (Axiovert 25; Zeiss, North York,
ON) and a #41017 Endow GFP filter set (Chroma Technologies, Brattleboro,
VT). Enrichment of GFP-expressing populations can be carried out as
follows. Cell sorting may be carried out, for example, using a FACS Vantage
flow cytometer (Becton Dickinson Immunocytometry Systems, San Jose,
CA) equipped with turbo-sort option and 2 Innova 306 lasers (Coherent, Palo
Alto, CA). For cell sorting, a 70 µm nozzle can be used. The buffer can be
changed to PBS (maintained at 20 p.s.i.). GFP may be excited with a 488
nm laser beam and emission detected in FL1 using a 500 EFLP filter.
Forward and side scattering can be adjusted to select for viable cells. Gating
parameters may be adjusted using untransfected cells as negative control
and GFP CHO cells as positive control.
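
The use of untransfected cells as a negative control for gating can be sketched as follows. This is a minimal illustration under stated assumptions: the nearest-rank percentile rule, the 99.9th-percentile cutoff, the function names and the intensity values are hypothetical and are not parameters given in this specification. A real gate would also take forward and side scatter into account to select viable cells.

```python
import math
from typing import List, Sequence


def fl1_threshold(negative_control: Sequence[float],
                  percentile: float = 99.9) -> float:
    """Nearest-rank percentile of FL1 values from untransfected cells."""
    if not negative_control:
        raise ValueError("negative control must contain at least one event")
    ordered = sorted(negative_control)
    rank = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    return ordered[rank - 1]


def gfp_positive(events: Sequence[float], threshold: float) -> List[bool]:
    """Flag events whose FL1 signal exceeds the control-derived threshold."""
    return [value > threshold for value in events]


# Hypothetical FL1 intensities (arbitrary units)
untransfected = [3.0, 4.5, 5.1, 6.0, 7.2, 8.0, 9.3, 10.0, 11.5, 12.0]
transfected = [5.0, 11.0, 48.0, 150.0, 9.0, 300.0]

threshold = fl1_threshold(untransfected)
flags = gfp_positive(transfected, threshold)
```

Events flagged True would be dispensed into the collection medium and the remainder directed to waste. A positive control would be used to confirm that genuine GFP signal falls well above the threshold.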

For the first round of sorting, transfected cells may be harvested
post-transfection (e.g., about 7-14 days post-transfection), converted to
protoplasts, resuspended in about 10 ml of growth medium and sorted for
GFP-expressing populations using parameters described above. GFP-positive
cells may be dispensed into a volume of about 5-10 ml of protoplast medium
while non-expressing cells are directed to waste. The expressing cells may
be cultured. Plant cells or calli can then be analyzed by fluorescence in situ
hybridization screening.

5. Analysis of transformed cells and identification and 
manipulation of artificial chromosomes 

Cells into which nucleic acids have been introduced, and which may
or may not have been further treated as described herein, may be analyzed
for indications of amplification of chromosomal segments, the presence of
structures that may arise in connection with amplification and de novo
artificial chromosome formation and/or the presence of desired artificial
chromosomes as described herein. Analysis of the cells typically involves
methods of visualizing chromosome structure, including, but not limited to,
G- and C-banding, PCR, Southern blotting and FISH analyses, using techniques
described herein and/or known to those of skill in the art. Such analyses can
employ specific labelling of particular nucleic acids, such as satellite DNA
sequences, heterochromatin, rDNA sequences and heterologous nucleic acid
sequences, that may be subject to amplification. During analysis of
transfected cells, a change in chromosome number and/or the appearance of
distinctive chromosomal structures (distinguished, for example, by increased
segmentation arising from amplification of repeat units) will also assist in
identification of cells containing artificial chromosomes. The following
description of events and structures that may be observed in analyzing cells
for evidence of chromosomal amplification and/or the presence of artificial
chromosomes is intended to be illustrative of the observations and
considerations that may occur in the analysis of cells of any type, including
mammalian and plant cells. It should be recognized that numerous types of
structures may be formed during amplification of chromosomal segments and
treatment of the cells. Additional, yet related, structures and variations of
these structures are contemplated herein and are recognizable based on the
descriptions and teachings of the generation and identification of artificial
chromosomes presented herein. Each structure can be further manipulated, for
example using procedures described herein, to derive additional chromosomal
structures and compositions.

Typically, de novo centromere formation occurs in cells upon
integration of heterologous nucleic acids into the cell chromosomes and
amplification of chromosomal and heterologous nucleic acids. The
integration and amplification that give rise to de novo centromere formation
typically occur at the centromeric region of the short arm of a chromosome,
typically an acrocentric chromosome. By employing methods such as
chromosome-staining methods, including FISH and G- and C-banding, it may
be possible to identify a chromosome at which the process occurs.

The amplification can lead to the formation of multicentric, typically
dicentric, chromosomes. Because of the presence of two or more
functionally active centromeres on the same chromosome, regular breakages
occur between the centromeres. Such specific chromosome breakages can
give rise to the appearance of a chromosome fragment carrying a
neo-centromere. The neo-centromere may be found on a minichromosome
(neo-minichromosome), while a formerly dicentric chromosome may carry traces
of the heterologous nucleic acid.

a. The neo-minichromosome 

Breakage of a dicentric chromosome between the two functional
centromeres can form at least two chromosomes, for example, a so-called
minichromosome, and a formerly dicentric chromosome. Treatment of cells
containing a dicentric chromosome, such as, for example, recloning,
treatment with agents that destabilize the chromosomes, e.g., BrdU, and/or
culturing under selective conditions, may facilitate breakage of the dicentric
chromosome. Selection of transformed cells can yield cell lines containing a
stable neo-minichromosome. The breakage of a multicentric, typically
dicentric, chromosome in transformed cells, which separates the
neo-centromere from the remainder of the endogenous chromosome, may occur,
for example, in the G-band positive heterologous nucleic acid region, as is
suggested if traces of the heterologous nucleic acid sequences at the broken
end of the formerly dicentric chromosome are observed.

Multiple E-type amplification (amplification of euchromatin) may form a
neo-chromosome, which separates from the remainder of the dicentric
chromosome through a specific breakage between the centromeres of the
dicentric chromosome. Inverted duplication of the fragment bearing the
neo-centromere can result in the formation of a stable neo-minichromosome. The
minichromosome is generally at least about 20-30 Mb in size.

The presence of inverted chromosome segments can be associated
with the chromosomes formed de novo at the centromeric region of a
chromosome. During the formation of the neo-minichromosome, the event
leading to the stabilization of the distal segment of the chromosome that
bears the duplicated neo-centromere may be the formation of its inverted
duplicate.

Although the neo-minichromosome typically carries only one functional
centromere, both ends of the minichromosome can be heterochromatic,
carrying, for example, satellite DNA sequences as discernible by in situ
hybridization. Comparison of the G-band pattern of a chromosome fragment
carrying the neo-centromere with that of a stable neo-minichromosome can
indicate that the neo-minichromosome is an inverted duplicate of the
chromosome fragment that bears the neo-centromere.

Cells containing a de novo-formed minichromosome, which contains
multiple repeats of the heterologous nucleic acids, can be used as recipient
cells in cell transfection. Donor nucleic acids, such as heterologous nucleic
acids containing DNA encoding a desired protein and DNA encoding a
second selectable marker, can be introduced into the cells and integrated into
the de novo-formed minichromosomes. To facilitate integration into the de
novo-formed minichromosomes, the heterologous DNA may also contain
sequences that are homologous to nucleic acids already present in the
minichromosomes, which can, through homologous recombination, provide
targeted integration into the minichromosome. Nucleic acids can also be
integrated into the minichromosome through the use of site-specific
recombinases by producing minichromosomes containing site-specific
recombination sites as described herein. Integration can be verified by in situ
hybridization and Southern blot analyses. Transcription and translation of
heterologous DNA can be confirmed by primer extension, immunoblot
analyses and reporter gene assays, if a reporter gene has been included in
the heterologous DNA, using, for example, appropriate nucleic acid probes
and/or product-specific antibodies.

The resulting engineered minichromosome that contains the heterologous
DNA can also be transferred, for example by cell fusion, into a recipient
cell line to further verify correct expression of the heterologous DNA.
Following production of the cells, metaphase chromosomes can be obtained,
such as by addition of colchicine, and the minichromosomes purified using
methods as described herein. The resulting minichromosomes can be used
for delivery to specific cells of interest using any known method or methods
for transferring heterologous nucleic acids into cells, particularly plant cells,
and/or methods described herein.

Thus, the neo-minichromosome is stably maintained in cells, replicates
autonomously, and permits the persistent, long-term expression of genes
under non-selective culture conditions, and in a whole, intact, regenerated
plant. It also can contain megabases of known heterologous DNA that can
serve as target sites for homologous recombination and integration of DNA
of interest. The neo-minichromosome is, thus, a vector for the delivery of
nucleic acids to cells and their expression therein.

Cell lines that contain artificial chromosomes, such as the
minichromosome, the neo-chromosome, and the heterochromatic artificial
chromosomes, are a convenient source of these chromosomes and can be
manipulated, such as by cell fusion or production of microcells for fusion
with selected cell lines, to deliver the chromosome of interest into a
multiplicity of cell lines, including cells from a variety of different plant
species.

b. Heterochromatin-containing and predominantly
heterochromatic artificial chromosomes

Manipulation of cells containing a fragment released upon breakage of
the dicentric chromosome (e.g., a formerly dicentric chromosome), for
example, by introducing additional heterologous nucleic acids, including, for
example, DNA encoding a second selectable marker, and growth under
selective conditions, can yield heterochromatic structures. Included among
such structures are compositions referred to as sausage chromosomes and
megachromosomes. For example, a formerly dicentric chromosome may
translocate to the end of another chromosome, such as an acrocentric
chromosome. Additional heterologous nucleic acids added to cells containing
a formerly dicentric chromosome can integrate into the pericentric
heterochromatin of the formerly dicentric chromosome and be amplified
several times with megabases of pericentric heterochromatic satellite DNA
sequences, forming a "sausage" chromosome carrying a newly formed
heterochromatic chromosome arm. The size of this heterochromatic arm can
vary, for example, between ~150 and ~800 Mb in individual metaphases.
The chromosome arm can contain four to five satellite segments rich in
satellite DNA, and evenly spaced integrated heterologous "foreign" DNA
sequences. At the end of the compact heterochromatic arm of the sausage
chromosome, a less condensed euchromatic terminal segment may be
observed. By capturing a euchromatic terminal segment, this new
chromosome arm is stabilized in the form of the "sausage" chromosome. In
subclones of sausage chromosome-containing cell lines, the heterochromatic
arm of the sausage chromosome may become unstable and show continuous
intrachromosomal growth, particularly after treatment with BrdU and/or drug
selection to induce further H-type amplification. In extreme cases, the
amplified chromosome arm can exceed 500 Mb or even 1000 Mb in size
(gigachromosome). Thus, the gigachromosome is a structure in which a
heterochromatic arm has amplified but not broken off from a euchromatic
arm.

In situ hybridization with, for example, biotin-labeled subfragments of
the added heterologous nucleic acids may show a hybridization signal only in
the heterochromatic arm of the sausage chromosome, indicating that the
heterologous nucleic acid sequences are localized in the pericentric
heterochromatin.

Gene expression, however, may be possible in the heterochromatic
environment of a sausage chromosome. The level of heterologous gene
expression may be determined by Northern hybridization with a subfragment
of the selectable marker gene. Reporter genes included in heterologous
nucleic acids also provide a readily detectable product for use in evaluating
gene expression in a sausage or other heterochromatic or predominantly
heterochromatic chromosome. Southern hybridization of DNA isolated
from subclones of sausage chromosome-containing cells with subfragments
of reporter (and selectable marker) genes can show a close correlation
between the intensity of hybridization and the length of the sausage
chromosome.

Cell lines containing sausage chromosomes can be manipulated to 
yield additional heterochromatic structures and artificial chromosomes, 
including, for example, an artificial chromosome referred to as a 
megachromosome. Such manipulation includes fusion of the cell line with
other cells and growth in the presence of one or more selection agents 
and/or BrdU. 

Cells with a structure, such as the sausage chromosome, can be
selected and fused with a second cell line, including other plant and
non-plant species [see, e.g., Dudits et al. (1976) Hereditas 52:121-123 for the
fusion of human cells with carrot protoplasts and Wiegand et al. (1987) J.
Cell Sci. (Pt. 2):145-149 for laser-induced fusion of plant protoplasts with
mammalian cells] to eliminate other chromosomes that are not of interest.
Structures such as sausage chromosomes formed during this process may be
further manipulated, for example, by treating the cells with agents that
destabilize chromosomes, e.g., BrdU, so that the heterochromatic arm forms
a chromosome that is substantially heterochromatic (e.g., a
megachromosome). Structures such as the gigachromosome, in which the
heterochromatic arm has amplified but not broken off from the euchromatic
arm, may also be observed. Further manipulation, such as fusions and
growth in selective conditions and/or BrdU treatment or other such
treatment, can lead to fragmentation of the megachromosome to form
smaller chromosomes that have the amplicon as the basic repeating unit.

If a cell with a sausage chromosome is selected, it can be treated with
an agent, such as BrdU, that destabilizes the chromosome so that the
heterochromatic arm forms a chromosome that is substantially
heterochromatic (e.g., a megachromosome). Prior to treating the cell with
BrdU, it can be fused with another cell line carrying chromosomes of another
species, in order to eliminate chromosomes of the original host cell and
obtain a cell in which the only chromosome from the host cell is the sausage
chromosome. The resulting hybrid cells can be grown in the presence of
multiple selection agents to select for those that carry the sausage
chromosome. In situ hybridization with chromosome painting probes that
detect chromosomes of both the host cell species and the species of cell to
which the host cell was fused can provide an indication of the chromosomal
makeup of the hybrid cells.

Cell lines containing a sausage chromosome can be treated with a
destabilizing agent, such as BrdU, followed by growth in selective medium
and retreatment with BrdU. The BrdU treatments appear to destabilize the
genome, resulting in a change in the sausage chromosome as well. A cell
population in which a further amplification has occurred will arise. In
addition to the heterochromatic arm (which may, for example, be ~100-150
Mb) of the sausage chromosome, an extra centromere and another (for
example, ~150-250 Mb) heterochromatic chromosome arm may be formed.
By the acquisition of another euchromatic terminal segment, a new
submetacentric chromosome (e.g., megachromosome) can form.

Megachromosomes may also be produced through regrowth and
establishment of sausage chromosome-containing cells in selective medium.
Repeated BrdU treatment can produce cell lines that have a dwarf
megachromosome (for example, about 150-200 Mb), a truncated
megachromosome (for example, about 90-120 Mb), or a
micro-megachromosome (for example, about 50-90 Mb). Cell lines containing
smaller truncated megachromosomes can be used to generate even smaller
megachromosomes, e.g., ~10-30 Mb in size. This may be accomplished,
for example, by breakage and fragmentation of a micro-megachromosome
through exposing the cells to X-ray irradiation, BrdU or telomere-directed in
vivo chromosome fragmentation.

Apart from the euchromatic terminal segments and the integrated
foreign nucleic acid, the whole megachromosome, as well as other related
types of predominantly heterochromatic artificial chromosomes, is
constitutive heterochromatin. This can be demonstrated by C-banding of the
megachromosome, which results in positive staining characteristic of
constitutive heterochromatin. It can contain tandem arrays of satellite DNA.
In a particular example, satellite DNA blocks are organized into a giant
palindrome (amplicon) carrying integrated exogenous nucleic acid sequences
at each end. It is of course understood that the specific organization and
size of each component can vary among species, and also with the chromosome
in which the amplification event initiates.

In general, a clear segmentation may be observed in one or more arms
of an amplification-based chromosome. For example, a megachromosome
may contain building units that are amplicons of, for example, ~30 Mb
containing satellite DNA with the integrated "foreign" DNA sequences at
both ends. The ~30 Mb amplicons may be composed of two ~15 Mb
inverted doublets of ~7.5 Mb satellite DNA blocks, which are separated
from each other by a narrow band of non-satellite sequences. The wider
non-satellite regions at the amplicon borders may contain integrated,
exogenous (heterologous) nucleic acid, while any narrow bands of
non-satellite DNA sequences within the amplicons may be integral parts of the
pericentric heterochromatin of the host chromosomes. The sizes of the
building units of a megachromosome or other amplification-based
chromosome may vary depending on the species of the host chromosome
from which the artificial chromosome was generated.
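
The nesting of building units described above (two ~7.5 Mb satellite blocks per inverted doublet, two ~15 Mb doublets per ~30 Mb amplicon) can be checked with a short worked calculation. This is a rough sketch: the function names are hypothetical, the narrow non-satellite bands and euchromatic terminal segments are ignored, and the sizes are the approximate example values given in the text, which may vary by species.

```python
SATELLITE_BLOCK_MB = 7.5      # approximate size of one satellite DNA block
BLOCKS_PER_DOUBLET = 2        # each inverted doublet pairs two blocks
DOUBLETS_PER_AMPLICON = 2     # each amplicon contains two doublets


def doublet_size_mb(block_mb: float = SATELLITE_BLOCK_MB) -> float:
    """Approximate size of one inverted doublet."""
    return block_mb * BLOCKS_PER_DOUBLET


def amplicon_size_mb(block_mb: float = SATELLITE_BLOCK_MB) -> float:
    """Approximate size of one amplicon (the basic repeating unit)."""
    return doublet_size_mb(block_mb) * DOUBLETS_PER_AMPLICON


def approximate_amplicon_count(chromosome_mb: float,
                               block_mb: float = SATELLITE_BLOCK_MB) -> float:
    """Rough number of amplicons that would fit in a given length."""
    if chromosome_mb < 0:
        raise ValueError("chromosome size must be non-negative")
    return chromosome_mb / amplicon_size_mb(block_mb)


doublet = doublet_size_mb()                   # 15.0 Mb
amplicon = amplicon_size_mb()                 # 30.0 Mb
units = approximate_amplicon_count(150.0)     # a 150 Mb example length
```

Under these assumptions, a 150 Mb heterochromatic length corresponds to roughly five amplicons.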

Further BrdU treatment can produce cells and/or calli that include cells
with a truncated megachromosome. The megachromosome can be further
fragmented in vivo using a chromosome fragmentation vector to ultimately
produce a chromosome that comprises a smaller stable replicable unit, for
example, about 15 Mb-60 Mb, containing one to four megareplicons.

Apart from the euchromatic terminal segments, the whole
megachromosome is heterochromatic, and has structural homogeneity.
Therefore, artificial chromosomes such as the megachromosome offer a
unique possibility for obtaining information about the amplification process,
and for analyzing some basic characteristics of the pericentric constitutive
heterochromatin, as a vector for heterologous DNA, and as a target for
further fragmentation.

C. Isolation of Artificial Chromosomes

The artificial chromosomes provided herein can be isolated by any
suitable method known to those of skill in the art. Also, methods are 
provided herein for effecting substantial purification, particularly of the 
artificial chromosomes. 

Artificial chromosomes may be sorted from endogenous
chromosomes using any suitable procedures, which typically involve isolating
metaphase chromosomes, distinguishing the artificial chromosomes from the
endogenous chromosomes, and separating the artificial chromosomes from
endogenous chromosomes. Such procedures will generally include the
following basic steps for animal cells and protoplasts: (1) culture of a
sufficient number of cells (typically about 2 x 10^7 mitotic cells) to yield,
preferably, on the order of 1 x 10^6 artificial chromosomes, (2) arrest of the
cell cycle of the cells in a stage of mitosis, preferably metaphase, using a
mitotic arrest agent such as colchicine, (3) treatment of the cells, particularly
by cell wall dissolution for plant cells and/or swelling of the cells in hypotonic
buffer, to increase susceptibility of the cells to disruption, (4) application
of physical force to disrupt the cells in the presence of isolation buffers for
stabilization of the released chromosomes, (5) dispersal of chromosomes in
the presence of isolation buffers for stabilization of free chromosomes, (6)
separation of artificial chromosomes from endogenous chromosomes and
(7) storage (and shipping if desired) of the isolated artificial chromosomes in
appropriate buffers. Modifications and variations of the general procedure
for isolation of artificial chromosomes, for example to accommodate different
cell types with differing growth characteristics and requirements and to
optimize the duration of mitotic block with arresting agents to obtain the
desired balance of chromosome yield and level of debris, may be empirically
determined (see Examples).
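
The scale figures in step (1), about 2 x 10^7 mitotic cells for a yield on the order of 1 x 10^6 artificial chromosomes, can be turned into a simple planning calculation. The sketch below is illustrative only. It assumes one artificial chromosome per mitotic cell, which this specification does not state, and the function names and the 5 x 10^6 target are hypothetical.

```python
import math


def recovery_fraction(mitotic_cells: float,
                      chromosomes_recovered: float,
                      copies_per_cell: int = 1) -> float:
    """Fraction of artificial chromosomes recovered from the starting cells.

    copies_per_cell=1 is an assumption made for this example.
    """
    if mitotic_cells <= 0 or copies_per_cell <= 0:
        raise ValueError("cell number and copies per cell must be positive")
    return chromosomes_recovered / (mitotic_cells * copies_per_cell)


def mitotic_cells_required(target_chromosomes: float,
                           fraction: float,
                           copies_per_cell: int = 1) -> int:
    """Mitotic cells needed to reach a target yield at a given recovery."""
    if fraction <= 0 or copies_per_cell <= 0:
        raise ValueError("recovery fraction and copies per cell must be positive")
    return math.ceil(target_chromosomes / (fraction * copies_per_cell))


# Figures from step (1): about 2 x 10^7 cells -> about 1 x 10^6 chromosomes
fraction = recovery_fraction(2e7, 1e6)
# Hypothetical scale-up target of 5 x 10^6 artificial chromosomes
cells_needed = mitotic_cells_required(5e6, fraction)
```

Under the one-copy assumption the stated figures imply an overall recovery of about 5%, so a five-fold larger yield would call for about 1 x 10^8 mitotic cells.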

Steps 1-5 relate to isolation of metaphase chromosomes. The
separation of artificial from endogenous chromosomes (step 6) may be
accomplished in a variety of ways. For example, the chromosomes may be
stained with DNA-specific dyes such as Hoechst 33258 and chromomycin
A3 and sorted into artificial chromosomes and endogenous chromosomes on
the basis of dye content by employing fluorescence-activated cell sorting
(FACS).

Artificial chromosomes have been isolated by fluorescence-activated
cell sorting (FACS). This method takes advantage of the nucleotide base
content of the artificial chromosomes. In the case of predominantly
heterochromatic artificial chromosomes, by virtue of their high
heterochromatic DNA content, they will differ from any other chromosomes
in a cell. In a particular embodiment, metaphase chromosomes are isolated
and stained with base-specific dyes, such as Hoechst 33258 and
chromomycin A3. Fluorescence-activated cell sorting will separate artificial
chromosomes from the endogenous chromosomes. A dual-laser cell sorter
(such as, for example, a FACS Vantage; Becton Dickinson Immunocytometry
Systems) in which two lasers are set to excite the dyes separately allows
a bivariate analysis of the chromosomes by base-pair composition and size.
Cells containing such artificial chromosomes can be similarly sorted.
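
The bivariate analysis described above can be pictured as a rectangular sort gate in a two-dye signal space. The sketch below is a hypothetical illustration: the gate boundaries, class and function names, and event values are invented for the example and are not instrument settings from this specification. It relies on the general observation that Hoechst 33258 binds preferentially to AT-rich DNA and chromomycin A3 to GC-rich DNA, so a chromosome with unusual base composition falls away from the endogenous chromosomes in such a plot.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Event = Tuple[float, float]  # (Hoechst signal, chromomycin A3 signal)


@dataclass(frozen=True)
class Gate:
    """Rectangular sort window in the two-dye signal space."""
    hoechst_min: float
    hoechst_max: float
    chromomycin_min: float
    chromomycin_max: float

    def contains(self, hoechst: float, chromomycin: float) -> bool:
        return (self.hoechst_min <= hoechst <= self.hoechst_max
                and self.chromomycin_min <= chromomycin <= self.chromomycin_max)


def sort_events(events: Iterable[Event],
                gate: Gate) -> Tuple[List[Event], List[Event]]:
    """Split events into those inside the gate and those outside it."""
    selected: List[Event] = []
    rejected: List[Event] = []
    for event in events:
        (selected if gate.contains(*event) else rejected).append(event)
    return selected, rejected


# Hypothetical signals (arbitrary units); the gate is placed around a
# population with high Hoechst and low chromomycin A3 signal.
gate = Gate(hoechst_min=600.0, hoechst_max=900.0,
            chromomycin_min=50.0, chromomycin_max=200.0)
events = [(700.0, 120.0), (300.0, 310.0), (850.0, 90.0), (450.0, 500.0)]
selected, rejected = sort_events(events, gate)
```

In practice the gate would be drawn around a population identified empirically on the instrument's bivariate plot rather than set from fixed numbers.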

Preparative amounts of artificial chromosomes (for example, 5 x 10^4 to
5 x 10^7 chromosomes/ml) at a purity of 95% or higher can be obtained. The
resulting artificial chromosomes are used for delivery to cells by methods
such as, for example, microinjection, liposome-mediated transfer, and
electroporation.

Additional methods provided herein for isolation of artificial
chromosomes from endogenous chromosomes include procedures that are
particularly well suited for large-scale isolation of artificial chromosomes. In
these methods, the size and density differences between artificial
chromosomes and endogenous chromosomes are exploited to effect
separation of these two types of chromosomes. To facilitate larger scale
isolation of the artificial chromosomes, different separation techniques may
be employed, such as swinging bucket centrifugation (to effect separation
based on chromosome size and density) [see, e.g., Mendelsohn et al. (1968)
J. Mol. Biol. 32:101-108], zonal rotor centrifugation (to effect separation on
the basis of chromosome size and density) [see, e.g., Burki et al. (1973)
Prep. Biochem. 3:157-182; Stubblefield et al. (1978) Biochem. Biophys. Res.
Commun. 83:1404-1414] and velocity sedimentation (to effect separation on the
basis of chromosome size and shape) [see, e.g., Collard et al. (1984)
Cytometry 5:9-19].

Affinity-, particularly immunoaffinity-, based methods for separation of
ACs from endogenous chromosomes are also provided herein. For example,
artificial chromosomes which are predominantly heterochromatin may be
separated from endogenous chromosomes through immunoaffinity
procedures involving antibodies that specifically recognize heterochromatin,
and/or the proteins associated therewith, when the endogenous
chromosomes contain relatively little heterochromatin.

Immunoaffinity purification may also be employed in larger scale
artificial chromosome isolation procedures. In this process, large
populations of artificial chromosome-containing cells (asynchronous or
mitotically enriched) are harvested en masse and the mitotic chromosomes
(which can be released from the cells using standard procedures such as by
incubation of the cells, such as freshly isolated protoplasts, in hypotonic
buffer and/or detergent treatment of the cells in conjunction with physical
disruption of the treated cells) are enriched by binding to antibodies that are
bound to solid state matrices (e.g., column resins or magnetic beads).
Antibodies suitable for use in this procedure bind to condensed centromeric
proteins or condensed and DNA-bound histone proteins. For example,
autoantibody LU851 (see Hadlaczky et al. (1989) Chromosoma 97:282-288),
which recognizes mammalian centromeres, may be used for large-scale
isolation of chromosomes prior to subsequent separation of artificial
chromosomes from endogenous chromosomes using methods such as FACS.
The bound chromosomes would be washed and eventually eluted for sorting.

Immunoaffinity purification may also be used directly to separate 
artificial chromosomes from endogenous chromosomes. For example, in the 
case of artificial chromosomes that are predominantly heterochromatic, the 
artificial chromosomes may be generated in or transferred to (e.g., by
microinjection or microcell fusion as described herein) a cell line that has 
chromosomes that contain relatively small amounts of heterochromatin, such 
as hamster cells (e.g., V79 cells or CHO-K1 cells). The predominantly 
heterochromatic artificial chromosomes are then separated from the 
endogenous chromosomes by utilizing anti-heterochromatin binding protein 
(Drosophila HP-1) antibody conjugated to a solid matrix. Such matrix 
preferentially binds artificial chromosomes relative to hamster chromosomes. 
Unbound hamster chromosomes are washed away from the matrix and the 
artificial chromosomes are eluted by standard techniques. Similarly, artificial 
chromosomes of one species, e.g., a plant-derived artificial chromosome,
may be separated from a background of endogenous chromosomes of 
another species, e.g., animal, such as mammalian, chromosomes, based on 
immunological differences of the two species, provided that antibodies that 
5 specifically recognize one species and not the other are available or can be 
generated. 

D. Generation of Artificial Chromosomes Through Assembly of 
Component Elements 

Artificial chromosomes can be constructed in vitro by assembling the 
structural and functional elements that contribute to a complete chromosome 
capable of stable replication and segregation alongside endogenous 
chromosomes in cells. The identification of the discrete elements that in 
combination yield a functional chromosome has made possible the in vitro 
assembly of artificial chromosomes. The process of in vitro assembly of 
artificial chromosomes, which can be rigidly controlled, provides advantages 
that may be desired in the generation of chromosomes that, for example, are 
required in large amounts or that are intended for specific use in transgenic 
organism systems. 

For example, in vitro assembly may be advantageous when efficiency 
of time and scale are important considerations in the preparation of artificial 
chromosomes. Because in vitro assembly methods do not involve extensive 
cell culture procedures, they may be utilized when the time and labor 
required to transform, feed, cultivate, and harvest cells used in de novo 
cell-based production systems are unavailable. 

Provided herein are in vitro assembly methods that include the joining 
of essential components, such as a centromere, telomere and an origin of 
replication, to yield an artificial chromosome, in particular, an artificial 
chromosome that functions in plants and that may contain components 
derived from plant chromosomes. Also provided are artificial chromosomes 
produced by the methods. Particular embodiments of the methods and 
chromosomes include a megareplicator. The megareplicator may contain 
rDNA, for example, mammalian or plant rDNA. In vitro assembled artificial 
chromosomes may contain any amount of heterochromatic and/or 
euchromatic nucleic acid. For example, an in vitro assembled artificial 
chromosome may be substantially all heterochromatin, while still containing 
protein-encoding DNA, or may contain increasing amounts of euchromatic 
DNA, such that, for example, it contains about 10%, 20%, 30%, 40%, 
50%, 60%, 70%, 80%, 90% or greater than about 90% euchromatic DNA. 

In vitro assembly may also be rigorously controlled with respect to the 
exact manner in which the several elements of the desired artificial 
chromosome are combined and in what sequence and proportions they are 
assembled to yield a chromosome of precise specifications. This feature is 
of particular significance in the generation of plant artificial chromosomes 
containing one or more regions of segmentation as described herein with 
reference to amplification-based artificial chromosomes. For example, certain 
plant chromosome structures (such as acrocentric chromosomes and/or 
chromosomes containing adjacent regions of heterochromatin and rDNA) that 
may be desirable for use in the generation of particular types of plant 
artificial chromosomes via amplification-based methods as described herein 
may be limited in number or may not exist. These particular types of plant 
artificial chromosomes, e.g., certain predominantly heterochromatic plant 
artificial chromosomes, may also be generated via in vitro assembly of 
artificial chromosomes as described herein. 

For example, plant artificial chromosomes containing regions of 
repeated nucleic acid units that are predominantly heterochromatic may be 
assembled by joining essential chromosomal components and repeat regions, 
or may be generated from an in vitro assembled artificial chromosome via 
amplification of heterochromatic DNA contained within an in vitro assembled 
artificial chromosome. For generation of such chromosomes via amplification 
of heterochromatic DNA contained within an in vitro assembled artificial 
chromosome, nucleic acids are introduced into a cell containing an in vitro 
assembled artificial chromosome and a resulting cell is selected that contains 
an artificial chromosome containing one or more regions of repeated nucleic 
acid units that are predominantly heterochromatic. The in vitro assembled 
artificial chromosome either contains a megareplicator to facilitate 
amplification of chromosomal DNA in connection with integration of nucleic 
acid into the chromosome or megareplicator-containing DNA is included in 
the nucleic acid that is integrated into the in vitro assembled artificial 
chromosome. 

The following describes the processes involved in the assembly of 
artificial chromosomes in vitro, utilizing a megachromosome as exemplary 
starting material. 

1. Identification and isolation of the components of the artificial 
chromosome 

The chromosomes provided herein are elegantly simple chromosomes 
for use in the identification and isolation of components to be used in the in 
vitro assembly of expression systems or artificial chromosomes. The ability 
to purify artificial chromosomes to a very high level of purity, as described 
herein, facilitates their use for these purposes. For example, the 
megachromosome, particularly truncated forms thereof, serves as starting 
material. With respect to the construction of an artificial chromosome 
containing at least some mammalian cell derived components, possible 
starting materials can be obtained from, for example, cell lines such as 1B3 
and mM2C1, which are derived from H1D3 (deposited at the European 
Collection of Animal Cell Culture (ECACC) under Accession No. 96040929). 
With respect to the construction of an artificial chromosome containing at 
least some plant cell derived components, possible starting materials include 
cells containing PACs, e.g., megachromosomes, generated as described 
herein. 

For example, the mM2C1 cell line contains a micro-megachromosome 
(~50-60 kB), which advantageously contains only one centromere, two 
regions of integrated heterologous DNA with adjacent rDNA sequences, with 
the remainder of the chromosomal DNA being mouse major satellite DNA. 
Other truncated megachromosomes can serve as a source of telomeres, or 
telomeres can be provided. The centromere of the mM2C1 cell line contains 
mouse minor satellite DNA, which provides a useful tag for isolation of the 
centromeric DNA. 

Additional features of particular ACs provided herein, such as the 
micro-megachromosome of the mM2C1 cell line, that make them uniquely 
suited to serve as starting materials in the isolation and identification of 
chromosomal components include the fact that the centromeres of each 
megachromosome within a single specific cell line are identical. The ability 
to begin with a homogeneous centromere source (as opposed to a mixture of 
different chromosomes having differing centromeric sequences) greatly 
facilitates the cloning of the centromere DNA. By digesting purified 
megachromosomes, particularly truncated megachromosomes, such as the 
micro-megachromosome, with appropriate restriction endonucleases and 
cloning the fragments into commercially available and well known YAC 
vectors (see, e.g., Burke et al. (1987) Science 236:806-812), BAC vectors 
(see, e.g., Shizuya et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:8794-8797; 
bacterial artificial chromosomes, which have a capacity of incorporating 
0.9 - 1 Mb of DNA) or PAC vectors (the P1 artificial chromosome vector, 
which is a P1 plasmid derivative that has a capacity of incorporating 300 kb 
of DNA and that is delivered to E. coli host cells by electroporation rather 
than by bacteriophage packaging; see, e.g., Ioannou et al. (1994) Nature 
Genetics 6:84-89; Pierce et al. (1992) Meth. Enzymol. 216:549-574; Pierce 
et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:2056-2060; U.S. Patent No. 
5,300,431 and International PCT application No. WO 92/14819), it 
is possible for as few as 50 clones to represent the entire 
micro-megachromosome. 
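
The relationship between vector capacity and the number of clones needed to cover a target can be sketched as simple arithmetic. The following is only an illustrative lower bound: the function name and the example sizes are assumptions introduced here, not values taken from the disclosure, and a practical library requires overlapping clones and therefore more than this minimum.

```python
import math


def min_clones(target_size_kb: float, insert_capacity_kb: float) -> int:
    """Lower bound on the number of non-overlapping clones spanning a target.

    Both arguments are hypothetical inputs in kilobases; the result ignores
    overlap between clones and uneven restriction fragment sizes.
    """
    if target_size_kb <= 0 or insert_capacity_kb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(target_size_kb / insert_capacity_kb)


# Illustrative values only: a 3,000 kb target in a vector holding 300 kb inserts.
print(min_clones(3_000, 300))
```

Larger-capacity vectors reduce the number of clones that must be screened, which is the practical reason for preferring YAC, BAC or PAC vectors in this step.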

a. Centromeres 
An exemplary centromere for use in the construction of an artificial 
chromosome is that contained within a megachromosome, such as those 
described herein. One example of a particular megachromosome-containing 
cell line provided is H1D3 and derivatives thereof, such as 
mM2C1 cells. Megachromosomes are isolated from such cell lines utilizing, 
for example, the procedures described herein, and the centromeric sequence 
is extracted from the isolated megachromosomes. For example, the 
megachromosomes may be separated into fragments utilizing selected 
restriction endonucleases that recognize and cut at sites that, for instance, 
are primarily located in the replication and/or heterologous DNA integration 
sites and/or in the satellite DNA. Based on the sizes of the resulting 
fragments, certain undesired elements may be separated from the 
centromere-containing sequences. The centromere-containing DNA could be 
as large as 1 Mb. 

Probes that specifically recognize centromeric sequences, such as 
mouse minor satellite DNA-based probes [see, e.g., Wong et al. (1988) Nucl. 
Acids Res. 16:11645-11661], pCT4.2 probe, a 3.5 kb fragment of 
Arabidopsis 5S rDNA (Campbell et al. (1992) Gene 112:225-223), 
Arabidopsis cosmids E4.11 (30 kb) and E4.6 (33 kb; Bent et al. (1994) 
Science 255:1856-1860) and 180 bp pAL1 repeat sequence (Maluszynska et 
al. (1991) Plant J. 7:159-166; and Martinez-Zapater et al. (1986) Mol. Gen. 
Genet. 204:417-423) may be used to isolate a centromere-containing YAC, 
BAC or PAC clone derived from the megachromosome. Alternatively, or in 
conjunction with the direct identification of centromere-containing 
megachromosomal DNA, probes that specifically recognize the 
non-centromeric elements, such as probes specific for mouse major satellite DNA, 
plant satellite DNA, the heterologous DNA and/or rDNA, may be used to 
identify and eliminate the non-centromeric DNA-containing clones. 
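
The two complementary screening strategies (positive selection with centromere-specific probes and elimination with probes for non-centromeric elements) can be sketched as a simple classification of hybridization results. This is a minimal illustration only; the function name, the clone names and the probe labels are hypothetical and are not part of the disclosure.

```python
def screen_clones(hybridization, centromere_probes, non_centromere_probes):
    """Sort clones by their hybridization signals.

    hybridization maps a clone name to the set of probe names that gave a
    signal. A clone hit by any centromere probe is kept as a candidate; a
    clone hit only by non-centromeric probes is eliminated. Clones with no
    signal are left unclassified.
    """
    candidates, eliminated = [], []
    for clone, hits in hybridization.items():
        if hits & centromere_probes:
            candidates.append(clone)
        elif hits & non_centromere_probes:
            eliminated.append(clone)
    return candidates, eliminated


# Hypothetical screening results for four clones.
results = {
    "clone_1": {"minor_satellite"},
    "clone_2": {"major_satellite"},
    "clone_3": {"minor_satellite", "rDNA"},
    "clone_4": set(),
}
print(screen_clones(results, {"minor_satellite"}, {"major_satellite", "rDNA"}))
```

A clone that carries both centromeric and flanking non-centromeric sequence is retained here as a candidate, on the assumption that a centromere-containing fragment may extend into adjacent DNA.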

Additionally, centromere cloning methods described herein may be 
utilized to isolate the centromere-containing sequence of the 
megachromosome. 

Once the centromere fragment has been isolated, it may be sequenced 
and the sequence information may in turn be used in PCR amplification of 
centromere sequences from megachromosomes or other sources of 
centromeres. Isolated centromeres may also be tested for function in vivo by 
transferring the DNA into a host cell. Functional analysis may include, for 
example, examining the ability of the centromere sequence to bind 
centromere-binding proteins. The cloned centromere will be transferred to 
cells with a selectable marker gene and the binding of a centromere-specific 
protein, such as anti-centromere antibodies (e.g., LU851, see, Hadlaczky et 
al. (1986) Exp. Cell Res. 167:1-15) can be used to assess function of the 
centromeres. 

b. Telomeres 

Telomeres that may be used in assembly of an artificial chromosome 
include a 1 kB synthetic telomere (see, e.g., PCT Application Publication No. 
WO 97/40183). A double synthetic telomere construct, which contains a 1 
kB synthetic telomere linked to a dominant selectable marker gene that 
continues in an inverted orientation may be used for ease of manipulation. 
Such a double construct contains a series of TTAGGG repeats 3' of the 
marker gene and a series of repeats of the inverted sequence, i.e., GGGATT, 
5' of the marker gene as follows: 

(GGGATT)n - dominant marker gene - (TTAGGG)n. 

Using an inverted 
marker provides an easy means for insertion, such as by blunt end ligation, 
since only properly oriented fragments will be selected. 
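
The layout of the double telomere construct can be sketched as a string-building exercise. This is an illustration under stated assumptions: the function names are hypothetical, the marker gene is represented by a placeholder string, and "inverted" is taken in the sense used in the text above (the repeat TTAGGG written in reverse order as GGGATT), not as a reverse complement.

```python
def repeats_for_length(target_bp: int, unit: str = "TTAGGG") -> int:
    """Whole number of repeat units that fit within a target length in bp."""
    return target_bp // len(unit)


def double_telomere_construct(marker: str, n: int, unit: str = "TTAGGG") -> str:
    """Return (inverted unit)n + marker + (unit)n as a single 5'-to-3' string."""
    inverted = unit[::-1]  # TTAGGG read backwards gives GGGATT
    return inverted * n + marker + unit * n


# "MARKER" is a placeholder standing in for a dominant selectable marker gene.
print(double_telomere_construct("MARKER", 2))
print(repeats_for_length(1000))  # units in a telomere of about 1 kb
```

Because the two repeat arrays face in opposite directions around the marker, the construct reads the same way from either end with respect to the telomere repeats, which is what makes orientation during blunt end ligation easy to select for.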

Telomere sequences also include sequences described in plants, for 
example, an Arabidopsis sequence containing head-to-tail arrays of the 
monomer repeat CCCTAAA totaling a few, for example 3-4, kb in length. 
Telomere sequences vary in length and do not appear to have a strict length 
requirement. An example of a cloned telomere is found in GenBank 
accession no. M20158 (Richards and Ausubel (1988) Cell 53:127-136) and 
in U.S. Patent No. 5,270,201. Yeast telomere sequences include those 
provided in GenBank accession no. S70807 (Louis et al. (1994) Yeast 
70:271-274). Additionally, a method for isolating a higher eukaryotic 
telomere from A. thaliana has been reported (Richards and Ausubel (1988) 
Cell 53:127-136; and U.S. Patent No. 5,270,201). 

c. Megareplicator 

The megareplicator sequences, such as those containing rDNA, 
provided herein are preferred for use in artificial chromosomes generated by 
assembly of component elements in vitro. The rDNA provides an origin of 
replication and also provides sequences that facilitate amplification of the 
artificial chromosome in vivo to increase the size of the chromosome to, for 
example, accommodate increasing copies of a heterologous gene of interest 
as well as continuous high levels of expression of the heterologous genes. 

d. Filler heterochromatin 
Filler heterochromatin, particularly satellite DNA, is included to 
maintain structural integrity and stability of the artificial chromosome and 
provide a structural base for carrying genes within the chromosome. The 
satellite DNA is typically A/T-rich DNA sequence, such as mouse major 
satellite DNA, or G/C-rich DNA sequence, such as hamster natural satellite 
DNA. Sources of such DNA include any eukaryotic organisms that carry 
non-coding satellite DNA with sufficient A/T or G/C composition to promote 
ready separation by sequence, such as by FACS, or by density gradients. 
Examples of plant satellite DNA include, but are not limited to, satellite DNA 
of soybean (see, e.g., Morgante et al. (1997) Chromosome Res. 5:363-373; 
and Vahedian et al. (1995) Plant Mol. Biol. 25:857-862), satellite DNA on 
the rye B chromosome (see, e.g., Langdon et al. (2000) Genetics 754:869-884) 
and satellite DNA in the Saccharum complex (see, e.g., Alix et al. 
(1998) Genome 47:854-864). The satellite DNA may also be synthesized by 
generating sequence containing monotone, tandem repeats of highly A/T- or 
G/C-rich DNA units. 
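
The idea of a synthetic satellite built from monotone tandem repeats, and the A/T composition that permits separation by base content, can be sketched as follows. The repeat unit shown is an arbitrary example chosen for illustration and is not a sequence taken from the disclosure; the function names are likewise hypothetical.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A and T bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("A") + seq.count("T")) / len(seq)


def tandem_repeat(unit: str, copies: int) -> str:
    """Head-to-tail array made of `copies` copies of one repeat unit."""
    return unit * copies


# Arbitrary A/T-rich unit used only to demonstrate the calculation.
filler = tandem_repeat("AATATTAC", 4)
print(len(filler), at_fraction(filler))
```

Since a tandem array has the same base composition as its unit, the composition of the filler is fixed by the choice of unit, while its length is set independently by the copy number.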
The most suitable amount of filler heterochromatin for use in 
construction of the artificial chromosome may be empirically determined by, 
for example, including segments of various lengths, increasing in size, in the 
construction process. Fragments that are too small to be suitable for use will 
not provide for a functional chromosome, which may be evaluated in 
cell-based expression studies, or will result in a chromosome of limited functional 
lifetime or mitotic and structural stability. 

e. Selectable marker 
Any convenient selectable marker, including specific examples 
described herein, may be used and at any convenient locus in the expression 
system. 

2. Combination of the isolated chromosomal elements 
Once the isolated elements are obtained, they may be combined to 
generate the complete, functional artificial chromosome expression system. 
This assembly can be accomplished, for example, by in vitro ligation either in 
solution, LMP agarose or on microbeads. The ligation is conducted so that 
one end of the centromere is directly joined to a telomere. The other end of 
the centromere, which serves as the gene-carrying chromosome arm, is built 
up from a combination of satellite DNA and megareplicator sequences, e.g., 
rDNA sequence, and may also contain a selectable marker gene. Another 
telomere is joined to the end of the gene-carrying chromosome arm. The 
gene-carrying arm is the site at which any heterologous genes of interest, for 
example, in expression of desired proteins encoded thereby, are incorporated 
either during in vitro assembly of the chromosome or sometime thereafter. 
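
The order of elements described above (telomere, centromere, gene-carrying arm, telomere) can be sketched as an ordered list with a simple check on the arm's contents. This is a schematic model only; the element labels and the function name are hypothetical, and the requirement that the arm contain both satellite DNA and a megareplicator reflects one reading of the paragraph above.

```python
REQUIRED_ARM_ELEMENTS = {"satellite", "megareplicator"}


def assemble_chromosome(arm_elements):
    """Return the linear order of elements for an in vitro assembled chromosome.

    One end of the centromere is joined directly to a telomere; the other
    end carries the gene-carrying arm, which is capped by a second telomere.
    Optional elements (e.g. a selectable marker or a heterologous gene) may
    appear anywhere in the arm.
    """
    arm = list(arm_elements)
    missing = REQUIRED_ARM_ELEMENTS - set(arm)
    if missing:
        raise ValueError(f"gene-carrying arm lacks: {sorted(missing)}")
    return ["telomere", "centromere", *arm, "telomere"]


print(assemble_chromosome(["satellite", "megareplicator", "selectable_marker"]))
```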
3. Analysis and testing of the artificial chromosome expression 
systems 

Artificial chromosomes assembled in vitro may be tested for 
functionality in cell systems, such as plant and animal cells, using any of the 
methods described herein for the artificial chromosomes, minichromosomes, 
or known to those of skill in the art. 

4. Introduction of desired heterologous DNA into the in vitro 
assembled chromosome 

Heterologous DNA may be introduced into the in vitro synthesized 
chromosome using routine methods of molecular biology, may be introduced 
using the methods described herein for the artificial chromosomes, or may be 
incorporated into the in vitro assembled chromosome as part of one of the 
synthetic elements, such as the heterochromatin. The heterologous DNA 
may be linked to a selected repeated fragment, and then the resulting 
construct may be amplified in vitro using the methods for such in vitro 
amplification provided herein. 

In a particular embodiment of these in vitro assembly methods, a 
site-specific recombination site is included in the assembly DNA or is added into 
the assembled chromosome, such as a plant in vitro assembled artificial 
chromosome, after initial assembly. The presence of a recombination site in 
the in vitro assembled artificial chromosome facilitates recombinase-catalyzed 
introduction of heterologous nucleic acid into the chromosome if the 
heterologous nucleic acid also contains a complementary recombination site. 
Such recombination systems include, but are not limited to, Cre/lox [see, 
e.g., Dale and Ow (1995) Gene 57:79-85], FLP/FRT [see, e.g., Nigel et al. 
(1995) The Plant Journal 5:637-652], R/RS [see, e.g., Onouchi et al. (1991) 
Nuc. Acids Res. 73:6373-6378], Gin/gix [see, e.g., Maeser and Kahman 
(1991) Mol. Gen. Genet. 230:170-176] and Int/att. The introduction of att 
recombination sites into a chromosome and the use of lambda phage 
integrase recombinase in conjunction therewith to permit engineering of 
natural and artificial chromosomes is described in copending U.S. provisional 
application Serial No. 60/294,758, by Perkins et al. entitled 
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2001, U.S. 
provisional application Serial No. 60/366,891, by Perkins et al. entitled 
"CHROMOSOME-BASED PLATFORMS" filed on March 21, 2002, U.S. patent 
application Serial No. , by Perkins et al. entitled "CHROMOSOME-BASED 
PLATFORMS" filed on May 30, 2002, under attorney docket no. 
24601-420, and PCT International Application No. , by Perkins et al. 
entitled "CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002, 
under attorney docket no. 24601-420PC, each of which is incorporated 
herein in its entirety by reference thereto. Thus, also contemplated herein 
are in vitro assembled artificial chromosomes, in particular such 
chromosomes containing plant chromosome-derived components, that 
contain one or more recombination sites, such as an att site. 

E. Methods for the Production of Plant Acrocentric Chromosomes and 
Plant Chromosomes Containing Adjacent Regions of rDNA and 
Heterochromatin 

Acrocentric human and mouse chromosomes in which the short arm 
contains only pericentric heterochromatin, an rDNA array, and telomeres can 
be used in the de novo formation of a satellite DNA based artificial 
chromosome (SATAC, also referred to as ACes). In some embodiments of 
the methods of producing a plant artificial chromosome provided herein, it 
may be desirable to introduce heterologous nucleic acids into a plant 
chromosome with arms of unequal length (e.g., into the short arm of an 
acrocentric chromosome) and/or containing adjacent regions of rDNA and 
heterochromatin, such as pericentric heterochromatin or satellite DNA. Of 
particular interest in such methods are plant acrocentric chromosomes that 
contain rDNA located adjacent to the pericentric heterochromatin or satellite 
DNA, and, in particular, on the short arm of the chromosome with little to no 
euchromatic DNA between the rDNA and the pericentric heterochromatin. 
Utilizing such structures as the initial composition in the generation of plant 
artificial chromosomes may facilitate generation of plant artificial 
chromosomes that are predominantly heterochromatic. For example, 
introduction of heterologous nucleic acid into a cell containing such an 
acrocentric plant chromosome such that the nucleic acid integrates into the 
pericentric heterochromatin and/or rDNA of the short arm of the chromosome 
may be associated with amplification (possibly through "megareplicator" 
DNA sequences such as may reside in plant rDNA arrays, also known as the 
nucleolar organizing regions (NOR)) of heterochromatin that leads to the 
formation of a predominantly heterochromatic plant artificial chromosome. 

Naturally occurring acrocentric plant chromosomes are limited in 
number, and plant chromosomes with a structure that includes adjacent 
regions of heterochromatin and rDNA may not exist or may not exist for a 
variety of plant species. Provided herein are methods for generating 
acrocentric plant chromosomes and plant chromosomes containing adjacent 
regions of rDNA and heterochromatin, in particular, pericentric and/or 
satellite heterochromatin. Further provided herein are methods for generating 
acrocentric plant chromosomes containing adjacent regions of 
heterochromatin, such as pericentric heterochromatin and/or satellite DNA, 
and rDNA on the short arm of the chromosome. 

Also provided herein are plant acrocentric chromosomes in which the 
nucleic acid of one or both arms of the chromosome contains less than about 
50%, or less than about 40%, or less than about 30%, or less than about 
20%, or less than about 10%, or less than about 5%, or less than about 
2%, or less than about 1%, or less than about 0.5% or less than about 
0.1% euchromatin. In some embodiments of these chromosomes, the 
nucleic acid of only one arm, either the short arm or the long arm, contains 
less than these specified amounts of euchromatin. In a particular 
embodiment of these chromosomes, the nucleic acid of the short arm 
contains less than these specified amounts of euchromatin. 

Further provided herein are plant chromosomes containing adjacent 
regions of heterochromatin, in particular pericentric heterochromatin or 
satellite DNA, and rDNA with little to no euchromatin between the two 
regions. With reference to such plant chromosomes, "little to no" means that 
the amount of euchromatic DNA, if any, located between the rDNA and 
heterochromatin (such as pericentric heterochromatin and/or satellite DNA), 
generally does not stain diffusely and recognizably as euchromatin and/or 
does not contain protein-encoding genes. Thus, in these chromosomes, 
between the heterochromatin (such as pericentric heterochromatin and/or 
satellite DNA) and the rDNA, there is substantially no chromatin that is less 
condensed than the heterochromatin (e.g., pericentric heterochromatin). The 
plant chromosomes containing adjacent regions of rDNA and 
heterochromatin (such as pericentric heterochromatin) provided herein may 
be acrocentric chromosomes. In a particular embodiment of these plant 
chromosomes, the adjacent regions of rDNA and heterochromatin, in 
particular pericentric heterochromatin, are contained on the short arm of the 
chromosome. 

Further provided are methods of utilizing such plant chromosomes in 
the generation of plant artificial chromosomes, and, in particular, 
predominantly heterochromatic plant artificial chromosomes, such as ACes 
(also referred to as SATACs). In particular methods of producing plant 
artificial chromosomes provided herein, nucleic acids are introduced into a 
cell containing a plant chromosome that is acrocentric and/or contains 
adjacent regions of rDNA and heterochromatin, such as pericentric 
heterochromatin, the cells are cultured through at least one cell division and 
a cell comprising an artificial chromosome, such as a predominantly 
heterochromatic artificial chromosome, is selected. In these methods, the 
plant chromosome into which nucleic acid is introduced may be an 
acrocentric chromosome containing adjacent regions of rDNA and 
heterochromatin on the short or long arm, and, in particular, on the short 
arm. 

The plant chromosomes provided herein can be generated using 
site-specific recombination between plant chromosome regions. The regions may 
be on the same chromosome or separate chromosomes. Through 
site-specific recombination, sections of plant chromosomes may be altered to 
remove, invert and/or insert sequences such that a desired plant 
chromosome results. The resulting plant chromosome is acrocentric and/or 
contains adjacent regions of heterochromatic DNA and rDNA, which may or 
may not be on the short arm of an acrocentric chromosome. Thus, the 
starting chromosome in these methods may be a plant chromosome or may 
be a plant acrocentric chromosome that does not contain adjacent regions of 
rDNA and heterochromatin, such as pericentric heterochromatin or satellite 
DNA. If the starting chromosome is acrocentric, then it may be used in the 
generation of a plant acrocentric chromosome that contains adjacent regions 
of heterochromatic DNA (e.g., pericentric heterochromatin and/or satellite 
DNA) and rDNA, particularly on the short arm of the chromosome, or to 
generate a plant acrocentric chromosome in which the nucleic acid of one or 
both arms contains less than about 50%, or less than about 40%, or less 
than about 30%, or less than about 20%, or less than about 10%, or less 
than about 5%, or less than about 2%, or less than about 1%, or less than 
about 0.5% or less than about 0.1% euchromatin. 

In one of the methods provided herein for producing a plant 
chromosome that is acrocentric and/or contains adjacent regions of rDNA 
and heterochromatin, nucleic acid containing a site-specific recombination 
site and nucleic acid containing a complementary site-specific recombination 
site are introduced into a cell containing one or more plant chromosomes. 
The nucleic acids may be introduced into the cell sequentially or 
simultaneously. The nucleic acids may also be targeted to particular 
chromosomes and/or particular sequences of a chromosome. Such targeting 
may be accomplished by including in the nucleic acids sequences 
homologous to particular sequences in the chromosome(s). 

The cell is then exposed to a recombinase activity. The recombinase 
activity can be provided by introduction of nucleic acid encoding the activity 
into the cell for expression of the activity therein, or may be added to the cell 
from an exogenous source. The recombinase activity is one that catalyzes 
recombination between sequences at the two recombination sites. An 
appropriate recombination event produces a plant chromosome that is 
acrocentric and/or contains adjacent regions of rDNA and heterochromatin 
(such as pericentric heterochromatin and/or satellite DNA) which may be 
readily identified therein based on its particular structure (e.g., arms of 
unequal length if the chromosome is acrocentric) and/or other features, e.g., 
the presence of particular added sequences, such as recombination sites and 
DNA encoding a selectable marker, the absence of particular sequences, 
such as excised euchromatic DNA, and the arrangement of sequences, such 
as the placement of rDNA segments adjacent to pericentric heterochromatin 
and/or satellite DNA. Such attributes may be detected using techniques 
known in the art for the analysis of nucleic acids and chromosomes, such as, 
for example, in situ hybridization. 

A number of site-specific recombination systems may be used in the 
production of plant chromosomes that are acrocentric and/or contain rDNA 
adjacent to heterochromatin, such as pericentric heterochromatin, as 
described herein. Such systems include, but are not limited to, Cre/lox [see, 
e.g., Dale and Ow (1995) Gene 57:79-85], FLP/FRT [see, e.g., Nigel et al. 
(1995) The Plant Journal 5:637-652], R/RS [see, e.g., Onouchi et al. (1991) 
Nuc. Acids Res. 73:6373-6378], Gin/gix [see, e.g., Maeser and Kahman 
(1991) Mol. Gen. Genet. 230:170-176] and Int/att. The introduction of att 
recombination sites into a chromosome and the use of lambda phage 
integrase recombinase in conjunction therewith to permit engineering of 
natural chromosomes is described in copending U.S. provisional application 
Serial No. 60/294,758 by Perkins et al. entitled "CHROMOSOME-BASED 
PLATFORMS" filed on May 30, 2001, U.S. provisional application Serial No. 
60/366,891, by Perkins et al. entitled "CHROMOSOME-BASED 
PLATFORMS" filed on March 21, 2002, U.S. patent application Serial No. 
, by Perkins et al. entitled "CHROMOSOME-BASED PLATFORMS" filed 
on May 30, 2002, under attorney docket no. 24601-420, and PCT 
International Application No. , by Perkins et al. entitled 
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002, under 
attorney docket no. 24601-420PC, each of which is incorporated herein in 
its entirety by reference thereto. These systems, as well as others known in 
the art, can be used to specifically excise or invert DNA (for example, in an 
intrachromosomal recombination), exchange regions of DNA (for example, in 
an inter-chromosomal recombination) or insert DNA (for example, through 
recombination between homologous sequences at a recombination site and 
the DNA to be inserted). The precise event is controlled by the orientation of 
the recombination site DNA sequences. 
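
The dependence of the outcome on site orientation can be sketched with a toy model in which a chromosome is an ordered list of named segments and the two recombination sites sit at list boundaries. This is a schematic illustration, not a model of any particular recombinase: the function name and segment labels are hypothetical. Sites in direct orientation lead to excision of the intervening DNA, while sites in inverted orientation lead to its inversion.

```python
def recombine_intrachromosomal(segments, first_site, second_site, direct_orientation):
    """Recombine two sites on one chromosome.

    segments[first_site:second_site] is the DNA lying between the sites.
    Returns (resulting chromosome, excised segments). With directly
    oriented sites the intervening DNA is excised; with inverted sites it
    is flipped in place and nothing is excised.
    """
    if not 0 <= first_site <= second_site <= len(segments):
        raise ValueError("site positions out of range")
    between = segments[first_site:second_site]
    if direct_orientation:
        return segments[:first_site] + segments[second_site:], between
    return segments[:first_site] + between[::-1] + segments[second_site:], []


arm = ["cen", "pericentric_het", "eu_1", "eu_2", "rDNA", "tel"]
print(recombine_intrachromosomal(arm, 2, 4, direct_orientation=True))
print(recombine_intrachromosomal(arm, 2, 4, direct_orientation=False))
```

In the first call the euchromatic segments between the sites are removed, leaving the rDNA adjacent to the pericentric heterochromatin; in the second call their order is reversed but the arm keeps its length.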

In particular embodiments of the methods for producing an acrocentric 
plant chromosome provided herein, nucleic acid containing complementary 
recombinase recognition sites for site-specific recombination is introduced 
into a cell containing one or more plant chromosomes wherein one of the 
sites integrates into, or close to, the pericentric heterochromatin and/or 
satellite DNA (in particular, proximal satellite DNA) of one plant chromosome 
in the cell. In a further embodiment, nucleic acid containing complementary 
recombinase recognition sites for site-specific recombination is introduced 
into a cell containing one or more plant chromosomes wherein one of the 
sites integrates into the distal end of an arm of a plant chromosome in the 
cell. In these embodiments, recombination between the sites in the presence 
of a recombinase that recognizes the sites can result in deletion of a portion 
of an arm of a chromosome, reciprocal translocation between a distal portion 
of a chromosome arm and a more proximal portion of another chromosome 
arm or reciprocal translocation between pericentric heterochromatin and/or 
satellite DNA of one chromosomal arm and a more distal portion of another 
chromosome arm. Each of these recombination events can serve to reduce 
the length of a chromosome arm and give rise to an acrocentric 
chromosome. 

In another embodiment, a nucleic acid containing a site-specific 
recombination site is introduced into a cell containing plant chromosomes 
wherein it integrates into the pericentric heterochromatin and/or satellite 
DNA of one plant chromosome in the cell and nucleic acid containing a 
complementary site-specific recombination site is introduced into the cell 
wherein it integrates into the distal end of an arm of another plant 
chromosome in the cell. In this embodiment, recombination between the 
sites in the presence of a recombinase that recognizes the sites can result in 
reciprocal translocation between the pericentric heterochromatin and/or 
satellite DNA of one chromosome and the distal portion of another 
chromosome arm thereby bringing these two regions into close proximity on 
one chromosomal arm and reducing the amount of DNA between the 
pericentric region of the arm and the end of the arm to generate an 
acrocentric plant chromosome. 

These methods for producing an acrocentric plant chromosome may
also be conducted such that nucleic acid containing a site-specific
recombination site is introduced into a cell containing a plant chromosome
wherein it integrates into, or close to, the pericentric heterochromatin and/or
satellite DNA of a plant chromosome in the cell and nucleic acid containing a
complementary site-specific recombination site is introduced into the cell
wherein it integrates into the distal end of the same arm of the same
chromosome. In this embodiment, recombination between the sites in direct
(i.e., the same, or head-to-tail) orientation in the presence of a recombinase
that recognizes the sites can result in intrachromosomal recombination
between the pericentric heterochromatin (and/or satellite DNA) and the distal
portion of the chromosomal arm thereby excising DNA between these two
regions and reducing the amount of DNA between them to generate an
acrocentric plant chromosome.
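
The shortening of a chromosome arm by excision between two directly oriented
sites can be illustrated with a toy model. The segment labels, the lengths
(in arbitrary units) and the function names are hypothetical and chosen only
for illustration; the arm is listed from the centromere outward.

```python
from typing import List, Tuple

Segment = Tuple[str, int]  # (label, length in arbitrary units)


def excise_between_sites(arm: List[Segment], site_a: str, site_b: str) -> List[Segment]:
    """Remove the DNA between two directly oriented sites on one arm.

    One site is kept at the junction; the intervening segments and the
    second site are removed. Toy model only.
    """
    labels = [label for label, _ in arm]
    i, j = sorted((labels.index(site_a), labels.index(site_b)))
    return arm[: i + 1] + arm[j + 1 :]


def arm_length(arm: List[Segment]) -> int:
    """Total length of all segments on the arm."""
    return sum(length for _, length in arm)


long_arm = [
    ("pericentric_heterochromatin", 10),
    ("site_1", 0),
    ("euchromatin", 80),
    ("site_2", 0),
    ("distal_end", 5),
]
short_arm = excise_between_sites(long_arm, "site_1", "site_2")
print(arm_length(long_arm), "->", arm_length(short_arm))
```

In this model the arm is reduced from 95 to 15 units, leaving the distal end
next to the pericentric heterochromatin.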

In particular embodiments of the methods provided herein for
producing a plant chromosome containing adjacent regions of rDNA and
heterochromatin, such as pericentric heterochromatin and/or satellite DNA,
nucleic acid containing complementary recombinase recognition sites for
site-specific recombination is introduced into a cell containing one or more
plant chromosomes wherein one of the sites integrates into heterochromatin of
one plant chromosome in the cell. In a further embodiment, nucleic acid
containing complementary recombinase recognition sites for site-specific
recombination is introduced into a cell containing one or more plant
chromosomes wherein one of the sites integrates into rDNA or a nucleolar
organizing region (NOR) of a plant chromosome in the cell. In these
embodiments, recombination between the sites in the presence of a
recombinase that recognizes the sites can result in deletion of DNA between
a heterochromatic region, such as the pericentric heterochromatin (and/or
satellite DNA), and rDNA, inversion of DNA that includes heterochromatin or
rDNA of a plant chromosome or reciprocal translocation between
heterochromatin of one chromosomal arm and rDNA of another chromosomal
arm. Each of these recombination events can serve to arrange chromosomal
DNA such that a region of heterochromatic DNA, such as pericentric
heterochromatin and/or satellite DNA, is adjacent to a region of rDNA on a
plant chromosome.

In another embodiment, nucleic acid containing a site-specific
recombination site is introduced into a cell containing plant chromosomes
wherein it integrates into heterochromatin, such as, for example, pericentric
heterochromatin and/or satellite DNA, of one plant chromosome in the cell
and nucleic acid containing a complementary site-specific
recombination site is introduced into the cell wherein it integrates into rDNA
of another plant chromosome in the cell. In this embodiment, recombination
between the sites can result in reciprocal translocation between the
heterochromatin of one chromosome and the rDNA of another chromosome
thereby bringing these two regions into close proximity on one plant
chromosome with little to no euchromatin between them.

These methods for producing a plant chromosome containing adjacent
regions of heterochromatic DNA and rDNA may also be conducted such that
nucleic acid containing site-specific recombination sites is introduced into a
cell containing a plant chromosome wherein it integrates into
heterochromatin, for example, pericentric heterochromatin and/or satellite
DNA, of a plant chromosome and nucleic acid containing a complementary
site-specific recombination site is introduced into the cell wherein it
integrates into rDNA of the same chromosome. In this embodiment,
recombination between the sites in direct orientation in the presence of a
recombinase that recognizes the sites can result in intrachromosomal
recombination between heterochromatin, such as pericentric heterochromatin
(and/or satellite DNA), and rDNA thereby excising DNA, including
euchromatic DNA, between these two regions. Recombination of the sites in
indirect (i.e., head-to-head) orientation in the presence of a recombinase can
result in inversion of DNA between the sites thereby replacing DNA, such as
euchromatin, located between pericentric heterochromatin (and/or satellite
DNA) and rDNA on the chromosome with rDNA. Thus, in the resulting plant
chromosome, rDNA is located adjacent to pericentric heterochromatin (and/or
satellite DNA), and DNA that was present between the pericentric
heterochromatin (and/or satellite DNA) and the rDNA is located distal to the
rDNA in a position previously occupied by the rDNA.
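
The inversion case can be sketched in the same spirit. The labels and the
function name are hypothetical, and the model assumes one site next to the
pericentric heterochromatin and the other just distal to the rDNA, with the
arm listed from the centromere outward.

```python
from typing import List


def invert_between_sites(arm: List[str], site_a: str, site_b: str) -> List[str]:
    """Reverse the order of the segments lying between two sites.

    Models recombination between sites in indirect (head-to-head)
    orientation: both sites are kept and the intervening DNA is inverted.
    """
    i, j = sorted((arm.index(site_a), arm.index(site_b)))
    return arm[: i + 1] + arm[i + 1 : j][::-1] + arm[j:]


arm_before = [
    "pericentric_heterochromatin",
    "site_1",
    "euchromatin",
    "rDNA",
    "site_2",
    "distal_DNA",
]
arm_after = invert_between_sites(arm_before, "site_1", "site_2")
print(arm_after)
```

After inversion the rDNA lies next to the pericentric heterochromatin, and the
euchromatin occupies the position previously held by the rDNA.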

In particular embodiments for producing an acrocentric plant
chromosome containing adjacent regions of heterochromatin, such as
pericentric heterochromatin (and/or satellite DNA), and rDNA, the short arm
of the acrocentric chromosome may be generated in the same recombination
event that places the heterochromatin and rDNA regions adjacent to each
other or in a separate recombination event. For example, nucleic acid
containing a site-specific recombination site may be introduced into a cell
containing one or more plant chromosomes wherein it integrates into the
pericentric heterochromatin of one plant chromosome and nucleic acid
containing a complementary site-specific recombination site may be
introduced into the cell wherein it integrates into rDNA that is located at a
distal portion of another plant chromosome or the same arm of the same
chromosome. Recombination of the sites in the presence of a
recombinase can result in intra- or inter-chromosomal recombination that not
only brings the pericentric heterochromatin (and/or satellite DNA) and rDNA
into close proximity on one chromosomal arm, but also sufficiently reduces
the length of that arm such that the resulting chromosome is acrocentric.

If a single recombination event such as this does not generate an
acrocentric plant chromosome, multiple recombination events may be used to
produce an acrocentric plant chromosome containing adjacent regions of
heterochromatic DNA and rDNA. For example, nucleic acid containing a
site-specific recombination site may be introduced into a cell containing one or
more plant chromosomes wherein it integrates into the pericentric
heterochromatin (and/or satellite DNA) of one plant chromosome and nucleic
acid containing a complementary site-specific recombination site may be
introduced into the cell wherein it integrates into rDNA of the same or a
different plant chromosome. As described above, recombination between
the sites in the presence of a recombinase can result in deletion, inversion or
reciprocal translocation of DNA to arrange chromosomal DNA such that
pericentric heterochromatin (and/or satellite DNA) is adjacent to a region of
rDNA on a plant chromosome. In order to reduce the length of the arm of
the chromosome on which the adjacent regions of heterochromatin and rDNA
are located, an additional recombination event can be induced by introducing
nucleic acid containing a site-specific recombination site into a cell containing
this plant chromosome wherein it integrates into a region of the chromosome
distal to the rDNA and nucleic acid containing a complementary site-specific
recombination site into the cell wherein it integrates into the distal end of the
same chromosome arm or of another plant chromosome arm. Recombination
between the recognition sites can result in deletion or reciprocal translocation
of DNA to reduce the length of the chromosome arm distal to the rDNA and
give rise to an acrocentric plant chromosome containing adjacent regions of
heterochromatin and rDNA on the short arm of the chromosome.

In each of the aforementioned methods for producing a plant
chromosome that is acrocentric and/or contains adjacent regions of
heterochromatin and rDNA, the nucleic acid containing the two or more
recombination sites may be introduced simultaneously or sequentially into a
cell or cells using nucleic acid transfer methods described herein or known in
the art. The nucleic acids may randomly integrate into plant chromosomes or
may be targeted for integration into a particular region or site on a plant
chromosome through homologous recombination between sequences in the
nucleic acid and sequences within the chromosome. The recombinase
activity may be provided by introduction of nucleic acid encoding an
appropriate recombinase into the cell for expression therein. The
recombinase-encoding nucleic acid may be introduced into the cell prior to,
during or after introduction of nucleic acids encoding recombination sites.

To facilitate identification of cells containing the transferred nucleic
acids and/or in which a recombination event has occurred, nucleic acid
encoding a selectable marker may be introduced into the cell. For example,
one or both of the nucleic acids containing a recombination site may also
contain DNA encoding a selectable marker (e.g., a resistance-encoding
marker or a reporter molecule) operatively linked to a promoter which is
oriented such that integration of the nucleic acid into a chromosome places
the marker DNA between two directly oriented recombination sites on an arm
of a chromosome. A cell containing the nucleic acid will thus be resistant to
a selection agent or will detectably express a reporter molecule. Exposure of
the cell to the appropriate recombinase can result in a recombination event
that excises the DNA between the two recombination sites, which includes
DNA encoding the selectable marker. Thus, recombination could be detected
as loss of reporter molecule expression or decreased resistance to a selection
agent. After exposure to a recombinase, the cells into which nucleic
acids containing recombination sites have been transferred may be analyzed
for the presence of acrocentric plant chromosomes using, for example, FISH
analysis and other chromosome visualization techniques.

In another method provided herein for producing a plant chromosome
that is acrocentric and/or contains adjacent regions of heterochromatin and
rDNA, the recombination event or events that lead to formation of the
chromosome occur through crossing of transgenic plants that contain
chromosomes which contain complementary site-specific recombination
sites. Thus, in one embodiment of these methods, nucleic acid containing a
recombination site adjacent to nucleic acid encoding a selectable marker is
introduced into a first plant cell and a first transgenic plant is generated from
the first plant cell. Nucleic acid containing a promoter functional in a plant
cell, a recombination site and a recombinase coding region in operative
linkage is introduced into a second plant cell from which a second transgenic
plant is generated. The first and second transgenic plants are crossed to
obtain one or more plants resistant to an agent that selects for cells
containing the nucleic acid encoding the selectable marker, and a resistant
plant that contains cells comprising a plant chromosome that is acrocentric
and/or contains adjacent regions of heterochromatin and rDNA is selected.

In an example of this method, nucleic acids containing site-specific
recombination sites are introduced into cells of Nicotiana tabacum. The
nucleic acids are introduced separately by infecting leaf explants with
Agrobacterium tumefaciens which carries the kanamycin-resistance gene
(KanR). Kanamycin-resistant transgenic plants are generated from the
infected leaf explants. One transgenic plant contains nucleic acid encoding a
promoterless hygromycin-resistance gene preceded by a lox site-specific
recombination sequence (lox-hpt); the other plant contains a cauliflower
mosaic virus 35S promoter linked to a lox sequence and the cre DNA
recombinase coding region (35S-lox-cre). The resultant KanR transgenic
plants are crossed (see, e.g., protocols of Qin et al. (1994) Proc. Natl. Acad.
Sci. U.S.A. 91:1706-1710). Plants in which the appropriate DNA
recombination event has occurred are identified by hygromycin-resistance.
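
The selection logic of this cross can be sketched as follows. The sketch
rests on an assumption that the text implies but does not spell out: the
promoterless hygromycin-resistance gene is expressed only after recombination
between the two lox sites places it downstream of the 35S promoter. The class,
field and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Progeny:
    has_lox_hpt: bool       # promoterless hygromycin-resistance construct
    has_35s_lox_cre: bool   # 35S promoter, lox site and cre coding region
    sites_recombined: bool  # assumed flag: recombination joined the lox sites


def hygromycin_resistant(plant: Progeny) -> bool:
    """Resistance requires both constructs and a recombination event."""
    return plant.has_lox_hpt and plant.has_35s_lox_cre and plant.sites_recombined


def select_recombinants(progeny: List[Progeny]) -> List[Progeny]:
    """Keep only the progeny expected to survive hygromycin selection."""
    return [plant for plant in progeny if hygromycin_resistant(plant)]


cross_progeny = [
    Progeny(has_lox_hpt=True, has_35s_lox_cre=False, sites_recombined=False),
    Progeny(has_lox_hpt=False, has_35s_lox_cre=True, sites_recombined=False),
    Progeny(has_lox_hpt=True, has_35s_lox_cre=True, sites_recombined=False),
    Progeny(has_lox_hpt=True, has_35s_lox_cre=True, sites_recombined=True),
]
print(len(select_recombinants(cross_progeny)))
```

Under this assumption, kanamycin resistance reports only the presence of a
construct, whereas hygromycin resistance reports that recombination has taken
place.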

The KanR cultivars initially may be screened, such as by FISH, to
identify two sets of candidate transgenic plants. One set has one construct
integrated in regions adjacent to the pericentric heterochromatin (and/or
satellite DNA) on the short arm of any chromosome. The second set of
candidate plants has the other construct integrated in rDNA, such as the
NOR region, of appropriate chromosomes. To obtain reciprocal translocation,
both sites must be in the same orientation. Therefore a series of crosses
may be required, marker-resistant plants generated, and FISH analyses
performed to identify an "acrocentric" plant chromosome or chromosomes
that contain adjacent regions of heterochromatin. As described above, such
an acrocentric chromosome may be used for de novo plant artificial
chromosome formation, particularly predominantly heterochromatic plant
artificial chromosomes. The selection of appropriate plant lines can be done,
for example, using marker-assisted selection.

F. Incorporation of Heterologous Nucleic Acids into Artificial 
Chromosomes 

Heterologous nucleic acids can be introduced into artificial
chromosomes during or after formation. Incorporation of particular desired
nucleic acids into an artificial chromosome during generation thereof may be
accomplished by including the desired nucleic acids along with the nucleic
acid encoding a selectable marker and any other nucleic acids used in
artificial chromosome generation (e.g., targeting sequences that direct the
heterologous nucleic acid to the pericentric region of a chromosome) in the
transformation of a cell to initiate amplification and formation of artificial
chromosomes.

Alternatively, heterologous nucleic acids may be incorporated into an
artificial chromosome following formation thereof through transfection of a
cell containing the artificial chromosome with the heterologous nucleic acids.
In general, incorporation of such nucleic acids into the artificial chromosome
is assured through site-directed integration, such as may be accomplished by
including nucleic acids homologous or identical to DNA contained within the
artificial chromosome with the heterologous nucleic acid when transferring
it to the artificial chromosome. An additional selective marker gene may also
be included.

Additionally, introduction of nucleic acids, particularly DNA molecules,
to an artificial chromosome can be accomplished by the use of site-specific
recombinases as described herein (see, also, copending U.S. provisional
application Serial No. 60/294,758 by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2001, U.S. provisional application
Serial No. 60/366,891, by Perkins et al., entitled "CHROMOSOME-BASED
PLATFORMS" filed on March 21, 2002, U.S. patent application Serial No.
, by Perkins et al. entitled "CHROMOSOME-BASED PLATFORMS" filed
on May 30, 2002, under attorney docket no. 24601-420, and PCT
International Application No. , by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002, under
attorney docket no. 24601-420PC; each of which is incorporated in its
entirety by reference thereto). Artificial chromosomes can be produced
containing recombinase recognition sequences, to allow the site-specific
introduction of DNA molecules into the same. Another use for an introduced
recombinase site is to provide a region for site-specific integration of a new
trait by the use of recombinase mediated gene insertion.

G. Introduction of Artificial Chromosomes into Plant Cells and Recovery
of Plants Containing Artificial Chromosomes 

Artificial chromosomes can be introduced into plant cells by a variety
of methods familiar to those skilled in the art. These methods include
chemical and physical methods for introduction of foreign DNA, as well as
cell culture methods to transfer chromosomes from one cell to another cell.

Any type of artificial chromosome can be used. Plant artificial
chromosomes (PACs) can be prepared by the in vivo and in vitro methods
described herein. PACs can be prepared inside plant protoplasts and then
transferred to other plant species and tissues, in particular to other plant
protoplasts, via fusion in the presence or absence of PEG as described herein
(Draper et al. (1982) Plant Cell Physiol. 23:451-458; Krens et al. (1982)
Nature 72-74). PACs can be isolated from the protoplasts in which they
were prepared, encapsulated into liposomes, and delivered to other plant
protoplasts (Deshayes et al. (1985) EMBO J. 4:2731-2737). Alternatively,
the PACs can be isolated and delivered directly to plant protoplasts, plant
cells, or other plant targets via a PEG-mediated process, calcium
phosphate-mediated process, electroporation, microinjection, particle bombardment,
lipid-mediated method with or without sonoporation, sonoporation alone, or
any method known in the art as described herein (Haim et al. (1985) Mol.
Gen. Genet. 199:161-168; Fromm et al. (1986) Nature 319:791-793; Fromm
et al. (1985) Proc. Nat. Acad. Sci. USA 82:5824-5828; Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Nat. Acad. Sci. USA 85:8502-8505;
and International PCT application publication no. WO 91/00358). Plant
artificial chromosomes can also be transferred to other plant species by
preparation of protoplast-derived plant microcells, and fusion of the
microcells containing the plant artificial chromosome with plant cells of other
plant species.

Mammalian artificial chromosomes (MACs) can be transferred to plant
cells. Mammalian artificial chromosomes are prepared by the in vivo and in
vitro methods described in US Patent Nos. 6,025,155 and 6,077,697, and
International PCT application No. WO 97/40183. MACs can be prepared as
microcells, and the microcells can be fused with plant protoplasts in the
presence or absence of PEG (Dudits et al. (1976) Hereditas 82:121-123;
Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149). Alternatively, the MACs
can be isolated and delivered directly to plant cells, protoplasts, and other
plant targets using a PEG-mediated process, calcium phosphate-mediated
process, electroporation, microinjection, lipid-mediated method with or
without sonoporation, sonoporation alone, or any method known in the art as
described herein and in US Patent Nos. 6,025,155 and 6,077,697, and
International PCT application publication No. WO 97/40183.

After PACs or MACs are introduced into plant targets and the plant
targets are grown and analyzed for transfection, the transformed plant
targets can be developed using standard conditions into roots, shoots,
plantlets, or any structure capable of growing into a plant.

Accordingly, methods for the introduction of artificial chromosomes 
represent the first step in the production of plant cells and whole plants 
containing artificial chromosomes from a variety of sources. 

The ability to introduce genes into plants, such that they are stably
expressed and transmissible from generation to generation, has
revolutionized plant biology and opens up new possibilities for using plants
as green factories for the production of commercially useful products as well
as for other applications described herein. There are several approaches to
the generation of stably transformed plants, and the adopted approach varies
according to the aims of the project. For introduction of artificial
chromosomes into plants, a variety of methods may be employed. For
transgenic plants, the transformation process involves the methods of foreign
DNA delivery to plant host cells, the growth and analysis of transformed
plant host cells, and the generation and regeneration of transgenic plants
from transformed plant host cells.

1. Introduction of artificial chromosomes into plant host cells
Numerous methods for producing or developing transgenic plants are
available to those of skill in the art. The method used is primarily a function
of the species of plant. Artificial chromosomes containing heterologous
DNA, such as artificial chromosomes prepared by the methods described
herein, can be introduced into plant host cells, including, but not limited to,
plant cells and protoplasts, by, for example, non-vector mediated DNA
transfer processes (see, also, copending U.S. application Serial No.
09/815,979, which describes methods for delivery that can be adapted for
use with plant cells and used with plant protoplasts).

Non-vector mediated, or direct, gene transfer systems involve the
introduction of heterologous DNA, in particular artificial chromosomes, into
host cells, including but not limited to plant cells and protoplasts, without the
use of a biological vector. The artificial chromosome that is introduced into
these plant host cells can lead to the development of transformed,
regenerable transgenic plants. The direct gene transfer systems for
transgenic plants are designed to overcome the barrier to DNA uptake
caused by the cell wall and the plasma membrane of plant cells. The
approaches for direct gene transfer include, but are not limited to, chemical,
electrical, and physical methods, which can also be adapted to optimize
transfer of artificial chromosomes (see, e.g., Uchimiya et al. (1989) J. of
Biotech. 12:1-20 for a review of such procedures; see also, e.g., U.S.
Patent Nos. 5,436,392; 5,489,520; Potrykus et al. (1985) Mol. Gen. Genet.
199:183; Lorz et al. (1985) Mol. Gen. Genet. 199:178; Fromm et al. (1985)
Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828; Uchimiya et al. (1986) Mol.
Gen. Genet. 204:204; Callis et al. (1987) Genes Dev. 1:1183-1200; Callis et
al. (1987) Nuc. Acids Res. 15:5823-5831; Marcotte et al. (1988) Nature
335:454 and Toriyama et al. (1988) Bio/Technology 6:1072-1074).

a. Chemical methods
Uptake of artificial chromosomes into plant cells, such as protoplasts,
can be accomplished in the absence or presence of polyethylene glycol
(PEG), which is a fusogen, or by any variations of such methods known to
those of skill in the art [see, e.g., U.S. Patent No. 4,684,611 to Schilperoot
et al.; Paskowski et al. (1984) EMBO J. 3:2717-2722; U.S. Patent Nos.
5,231,019 and 5,453,367]. In one approach, plant protoplasts are
incubated with a solution of foreign DNA, in particular artificial
chromosomes, and PEG at a concentration that allows for high cell survival
and high efficiency chromosome uptake. The protoplasts are then washed
and cultured [Datta and Datta (1999) Meth. in Molecular Biol. 111:335-348].
In an alternative approach, plant protoplasts are incubated with artificial
chromosomes in the presence of calcium phosphate for direct artificial
chromosome uptake (Haim et al. (1985) Mol. Gen. Genet. 199:161-168).
Alternatively, the artificial chromosome, in particular plant artificial
chromosome (PAC), is formed in a plant protoplast which is, in turn, fused
with another plant protoplast in the presence or absence of PEG to transfer
the PAC to the plant host protoplast. Such methods for treating protoplasts
with PEG and foreign DNA are well known in the art (Draper et al. (1982)
Plant Cell Physiol. 23:451-458; Krens et al. (1982) Nature 72-74).

Another chemical direct gene transfer method involves lipid-mediated
delivery of artificial chromosomes to plant protoplasts. In this process,
liposomes with encapsulated artificial chromosomes are allowed to fuse with
protoplasts alone or in the presence of PEG as the fusogen to transfer the
foreign DNA, in particular artificial chromosome, to the plant host protoplast
(Deshayes et al. (1985) EMBO J. 4:2731-2737; Fraley and Paphadjopoulos
(1982) Curr Top Microbiol Immunol 96:171-191).

Another direct gene transfer method involves the use of microcells.
The chromosomes can be transferred by preparing microcells containing
artificial chromosomes and then fusing the microcells with plant protoplasts.
Methods for the preparation and fusion of microcells with other cells are well
known in the art (see Example No. 4 and see also, e.g., U.S. Patent Nos.
5,240,840; 4,806,476; 5,298,429; 5,396,767; Fournier (1981) Proc. Natl.
Acad. Sci. U.S.A. 78:6349-6353; Lambert et al. (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:5907-59; Dudits et al. (1976) Hereditas 82:121-123;
and Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149).

b. Electrical methods
Electroporation, which involves high-voltage electrical pulses to a solution
containing a mixture of protoplasts or plant cells and foreign DNA, in
particular artificial chromosomes, to create nanometer-sized, reversible pores,
is a common method to introduce DNA into plant cells or protoplasts. The
exogenous DNA may be added to the protoplasts in any form such as, for
example, naked linear, circular or supercoiled DNA, artificial chromosomes
encapsulated in liposomes, DNA in spheroplasts, artificial chromosomes in
other plant protoplasts, artificial chromosomes complexed with salts, and
other methods. The foreign DNA, in particular artificial chromosome, can also
include a phenotypic marker to identify plant cells that are successfully
transformed.

When plant cells or protoplasts are subjected to short electrical DC (direct
current) pulses, they may experience an increase in the permeability of the
plasma membrane and/or cell wall to hydrophilic molecules such as nucleic
acids, which are normally unable to enter the plant cell directly. Nucleic
acids are taken directly into the cell cytoplasm either through these pores or
as a consequence of the redistribution of membrane components that
accompanies closure of the pores. Certain cell wall-degrading enzymes, such
as pectin-degrading enzymes, may be employed to render the plant target
recipient cells more susceptible to DNA or artificial chromosome uptake by
electroporation than untreated cells. Plant recipient cells may also be
susceptible to transformation by mechanical wounding. To effect
transformation by electroporation, friable tissues such as a suspension
culture of cells or embryonic callus may be used or immature embryos or
other organized tissues may be directly transformed (see, e.g., Fromm et al.
(1986) Nature 319:791-793). Methods for effecting electroporation are well
known in the art (see, e.g., U.S. Patent Nos. 4,784,737; 4,970,154;
5,304,486; 5,501,967; 5,501,662; 5,019,034; 5,503,999; see, also, Fromm
et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828; Zimmerman et al.
(1981) Biophys Biochem Acta 641:160-165; Neuman et al. (1982) EMBO J.
1:841-845; Riggs et al. (1986) Proc. Nat. Acad. Sci. USA 83:5602-5606;
Lurquin (1997) Mol. Biotechnol. 7:5-35; Bates (1999) Methods in Molecular
Biology 111:359-366). Electroporation can be used to introduce nucleic
acids into tobacco mesophyll cells (Morikawa et al. (1986) Gene 41:121-124);
leaf bases of rice (Dekeyser et al. (1990) Plant Cell 2:591-602);
immature maize embryos (Songstad et al. (1993) Plant Cell Tiss. Orgn. Cult.
40:1-15); macerated immature maize embryos (D'Halluin et al. (1992) Plant
Cell 4:1495-1505); suspension cultured maize cells (Laursen et al. (1994)
Plant Mol. Biol. 24:51-61); and sugar cane (Arencibia et al. (1995) Plant Cell
Rep. 14:305-309).

Artificial chromosomes may be delivered to plant cells, in particular
plant seeds, by the use of electroporation and pollen to derive pollen
comprising an artificial chromosome. Methods that may be used for delivery
of artificial chromosomes into pollen include, for example, techniques
described in U.S. Patent No. 5,049,500 and by Negrutiu et al. [in
Biotechnology and Ecology of Pollen, Mulcahy et al. eds., (1986) Springer
Verlag, N.Y., pp. 65-69] and Fromm et al. [(1986) Nature 319:791; including
methods for introducing DNA into mature pollen using various procedures
such as heat shock, PEG and electroporation]. The pollen is capable of
germinating and fertilizing an egg cell, leading to the formation of a plant
seed comprising an artificial chromosome.

c. Physical methods
The physical methods approach for introducing foreign DNA, in
particular artificial chromosomes, into plant cells overcomes the cell wall
barrier to DNA movement. Physical, or mechanical, means are used to
introduce transgenes directly into protoplasts or plant cells and include, but
are not limited to, microinjection, particle bombardment, and sonoporation.

(1) Microinjection 

Microinjection involves the mechanical injection of heterologous DNA,
in particular artificial chromosomes, into plant cells, including cultured cells
and cells in intact plant organs and embryoids in tissue culture via very small
micropipettes, needles, or syringes (Neuhaus et al. (1987) Theor. Appl. Genet.
75:30-36; Reich et al. (1986) Can. J. Bot. 64:1255-1258; Crossway et al.
(1986) BioTechniques 4:320-334; Crossway et al. (1986) Mol. Gen. Genet.
202:179; U.S. Patent No. 4,743,548; silicon carbide whiskers (Kaeppler et
al. (1990) Plant Cell Rep. 9:415-418; Frame et al. (1994))). For example,
microinjection of protoplast cells with foreign DNA for transformation of plant
cells has been reported for barley and tobacco (see, e.g., Holm et al. (2000)
Transgenic Res. 9:21-32 and Schnorf et al. Transgenic Res. 7:23-30). Single
artificial chromosomes may be front-loaded into microinjection needles and
then injected into cells ("pick-and-inject") following procedures as described
by Co et al. [(2000) Chromosome Res. 8:183-191].

(2) Particle bombardment 

Microprojectile bombardment (acceleration of small high density
particles, which contain the DNA, to high velocity with a particle gun
apparatus, which forces the particles to penetrate plant cell walls and
membranes) has also been used to introduce heterologous DNA into plant
cells. Microprojectile bombardment techniques for the introduction of nucleic
acids into plant cells, in addition to being an effective means of reproducibly
stably transforming plant cells, particularly monocots, do not require isolation
of protoplasts or susceptibility of the host cell to Agrobacterium infection. In
these methods, nucleic acids are carried through the cell wall and into the
cytoplasm on the surface of small, typically metal, particles (see, e.g., Klein
et al. (1987) Nature 327:70; Klein et al. (1988) Proc. Natl. Acad. Sci. U.S.A.
85:8502-8505; Klein et al. in Progress in Plant Cellular and Molecular
Biology, eds. Nijkamp, H.J.J., Van der Plas, J.H.W., and Van Aartrijk, J.,
Kluwer Academic Publishers, Dordrecht, (1988), p. 56-66 and McCabe et al.
(1988) Bio/Technology 6:923-926; Sautter et al. (1991) Bio/Technology
9:1080-1085; Gordon-Kamm et al. (1990) Plant Cell 2:603-618; Finer et al.
(1999) Curr. Top. Microbiol. Immunol. 240:59-80; Vasil and Vasil (1999)
Methods in Molecular Biology 111:349-358; Seki et al. (1999) Mol.
Biotechnol. 11:251-255). Particles may be coated with nucleic acids and
delivered into cells by a propelling force. Exemplary particles include those
containing tungsten, gold or platinum, as well as magnesium sulfate crystals.
The metal particles can penetrate through several layers of cells and thus
allow the transformation of cells within tissue explants.

In an illustrative embodiment (see, e.g., U.S. Patent No. 6,023,013) of
a method for delivering foreign nucleic acids into plant cells, e.g., maize
cells, by acceleration, a Biolistics Particle Delivery System may be used to
propel particles coated with DNA or cells through a screen, such as a
stainless steel or Nytex screen, onto a filter surface covered with plant (e.g.,
corn) cells cultured in suspension. The screen disperses the particles so that
they are not delivered to the recipient cells in large aggregates. The
intervening screen between the projectile apparatus and the cells to be
bombarded may reduce the size of projectile aggregates and may contribute
to a higher frequency of transformation by reducing damage inflicted on the
recipient cells by projectiles that are too large.

For the bombardment, cells in suspension may be concentrated on
filters or solid culture medium. Alternatively, immature embryos or other
plant target cells may be arranged on solid culture medium. The cells to be
bombarded are typically positioned at an appropriate distance below the
microprojectile stopping plate. If desired, one or more screens may also be
positioned between the acceleration device and the cells to be bombarded.

The prebombardment culturing conditions and bombardment
parameters may be optimized to yield the maximum numbers of stable
transformants. Both the physical and biological parameters for bombardment
are important in this technology. Physical factors include those that involve
manipulating the DNA/microprojectile precipitate or those that affect the
flight and velocity of either the macro- or microprojectiles. Biological factors
include all steps involved in manipulation of cells before and immediately
after bombardment, the osmotic adjustment of target cells to help alleviate
the trauma associated with bombardment, and also the nature of the
transforming nucleic acid, such as linearized DNA, intact supercoiled
plasmids, or artificial chromosomes.

Physical parameters that may be adjusted include gap distance, flight
distance, tissue distance and helium pressure. In addition, transformation
may be optimized by adjusting the osmotic state, tissue hydration and
subculture stage or cell cycle of the recipient cells. Ballistic particle
acceleration devices are available from Agracetus, Inc. (Madison, WI) and
BioRad (Hercules, CA).
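
By way of illustration only, the adjustable physical parameters listed above can be laid out as a full factorial grid when planning an optimization experiment. The following is a minimal sketch under stated assumptions: the numeric levels and the names PARAMETER_LEVELS and bombardment_conditions are hypothetical placeholders and are not settings taught or recommended herein.

```python
from itertools import product

# Hypothetical levels for the four physical parameters named in the text.
# The numbers are placeholders for illustration, not recommended settings.
PARAMETER_LEVELS = {
    "gap_distance_mm": [3, 6, 9],
    "flight_distance_mm": [6, 10],
    "tissue_distance_cm": [6, 9, 12],
    "helium_pressure_psi": [650, 1100, 1550],
}


def bombardment_conditions(levels):
    """Return every combination of parameter levels as a list of dicts."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*levels.values())]


conditions = bombardment_conditions(PARAMETER_LEVELS)
print(len(conditions), "conditions to test")
print(conditions[0])
```

Each resulting condition could then be paired with the biological variables (osmotic state, tissue hydration, subculture stage) and scored by the number of stable transformants recovered.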

Techniques for transformation of an A188-derived maize line using
particle bombardment are described in Gordon-Kamm et al. (1990) Plant Cell
2:603-618 and Fromm et al. (1990) Bio/Technology 8:833-839.
Transformation of rice may also be accomplished via particle bombardment
(see, e.g., Christou et al. (1991) Bio/Technology 9:957-962). Particle
bombardment may also be used to transform wheat (see, e.g., Vasil et al.
(1992) Bio/Technology 10:667-674 for transformation of cells of type C
long-term regenerable callus; and Weeks et al. (1993) Plant Physiol.
102:1077-1084 for transformation of wheat using particle bombardment of
immature embryos and immature embryo-derived callus). The production of
transgenic barley using bombardment methods is described, for example, by
Koprek et al. (1996) Plant Sci. 119:79-91.

(3) Sonoporation

Foreign DNA, in particular artificial chromosomes, may be introduced
into plant protoplasts using ultrasound treatment, in particular mild
ultrasound treatment (10-100 kHz), to create pores for DNA uptake (see, e.g.,
International PCT application publication no. WO 91/00358) or may be
introduced into plant protoplasts via a sonoporation machine (ImaRx
Pharmaceutical Corp., Tucson, AZ).

Alternatively, the delivery of artificial chromosomes into plant host
cells is performed by any method described herein or well known in the art.
For example, needle-like whiskers (US 5,302,523, 1994, US 5,464,765)
have been used to deliver foreign DNA.

Suitable plant targets into which foreign DNA, in particular artificial
chromosomes, is transferred include, but are not limited to, protoplasts, cell
culture cells, cells in plant tissue, meristem cells, microspores, callus,
pollen, pollen tubes, egg-cells, embryo-sacs, zygotes or embryos in
different stages of development, seeds, seedlings, roots, stems, leaves,
whole plants, algae, or any plant part capable of proliferation and
regeneration of plants (see, e.g., U.S. Patent Nos. 5,990,390; 6,037,526
and 5,990,390). The growth of the transformed plant targets described
herein can be done with tissue culture or non-tissue culture methods, with
the preferred methods being tissue culture methods.

All plant cells into which foreign DNA, in particular artificial
chromosomes, is introduced, and all plant material that is regenerated from
the transformed cells, are used directly for expressed purposes (e.g.,
herbicide resistance, insect/pest resistance, disease resistance,
environmental/stress resistance, nutrient utilization, male sterility,
improved nutritional content, production of chemicals or biologicals,
non-protein expressing sequences, and preparation and screening of
libraries) as described herein or are used to produce transformed whole
plants for the applications and uses described herein. The particular
protocol and means for the introduction of the artificial chromosome into
the plant host is adapted or refined to suit the particular plant species
or cultivar.

Chromosomes may be transferred to cells by microcell mediated
chromosome transfer (MMCT) (Telenius et al., Chromosome Research 7:3-7,
1999; Ramulu et al., Methods in Molecular Biology 111:227-242, 1999). In
general, donor plant cultures or donor mammalian cell cultures are incubated
in media supplemented with reagents that inhibit DNA synthesis (e.g.,
hydroxyurea, aphidicolin) and/or reagents that inhibit attachment of
chromosomes to the mitotic spindle (e.g., colcemid, colchicine,
amiprophos-methyl, cremart). The cell walls of plant cells are digested with
enzymes (e.g., cellulase, macerozyme), producing protoplasts. Donor plant
protoplasts or donor mammalian cells are loaded on a Percoll gradient in the
presence of cytochalasin-B (which causes the cell cytoskeleton to
depolymerize into monomer protein subunits) and centrifuged at 10^5 x g.
During centrifugation the metaphase chromosomes are extruded through the
plasma membrane, forming plant 'microprotoplasts' or mammalian
'microcells.' The microprotoplasts/microcells are filtered through nylon
sieves of decreasing pore size (8-3 µm) to isolate smaller ones that contain
predominantly one metaphase chromosome. The microprotoplasts/microcells
are fused to recipient plant protoplasts or mammalian cells by polyethylene
glycol (PEG) treatment. The fusion mixture is cultured in appropriate media.
If the chromosome of interest is expressing a selection marker gene, the
fusion mixtures may be cultured in appropriate media supplemented with the
appropriate selection drug (e.g., hygromycin, kanamycin).
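
The size-selection step above, in which microprotoplasts or microcells pass through sieves of decreasing pore size, can be pictured with the following simplified sketch. It is offered as an illustration under stated assumptions only: the intermediate 5 µm sieve, the sample diameters and the names SIEVE_PORES_UM and filter_through_sieves are hypothetical, and a real sieve does not give the sharp size cutoff modeled here.

```python
# Pore sizes in micrometers spanning the 8-3 µm range given in the text.
# The intermediate 5 µm sieve is a hypothetical choice for illustration.
SIEVE_PORES_UM = [8.0, 5.0, 3.0]


def filter_through_sieves(diameters_um, pores_um=SIEVE_PORES_UM):
    """Pass particles through each sieve, largest pore first.

    A particle is kept only if its diameter is smaller than every pore.
    """
    passed = list(diameters_um)
    for pore in sorted(pores_um, reverse=True):
        passed = [d for d in passed if d < pore]
    return passed


# Hypothetical microcell diameters in micrometers.
sample = [1.5, 2.8, 3.4, 4.9, 6.2, 9.0]
print(filter_through_sieves(sample))
```

In this toy model only the particles smaller than the finest pore remain, which corresponds to the enrichment for the smaller microprotoplasts or microcells described above.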

2. The growth of transformed plant host cells

In tissue culture methods, plant cells or protoplasts transformed by the
chemical, physical, or electrical methods described herein are grown, or
cultured, under selective conditions. The selective markers are integrated
into the heterologous DNA, in particular the artificial chromosome, before
its introduction to plant hosts or are integrated into the plant host after
transfection. An additional marker can be used for double selection.
Generally, the plant cells or protoplasts are grown for numerous generations,
after which the transformed cells are identified.

The transformed cells are subjected to conditions known in the art for
callus initiation. Tissue that develops during the initiation period is placed
in a regeneration or selection medium where shoot and root development occur.
The plantlets are analyzed for the determination of transformation
(International PCT application publication no. WO 00/60061). In the case of
maize, embryonic callus cultures are initiated from immature maize embryos,
bombarded with genes, and regenerated into plantlets by the methods
described in International PCT application publication no. WO 00/60061. In
tissue culture methods, rice calli are transformed with DNA encoding
insecticidal proteins CryIA(b) and CryIA(c) for insect resistance. Common
tissue culture methods can also be used to transform tobacco and tomato
(see, e.g., US Patent No. 6,136,320), embryogenic maize calli (US Pat. Nos.
5,508,468; 5,538,877; 5,538,880; 5,780,708; 6,013,863; 5,554,798;
5,990,390; and 5,484,956) and other crop species, e.g., potato and
tobacco (Sijmons et al. (1990) Bio/Technol. 8:217-221), tobacco
(Vandekerckhove et al. (1989) Bio/Technol. 7:929-932 and Owen and Pen,
eds., Transgenic Plants: A Production System for Industrial and
Pharmaceutical Proteins, John Wiley & Sons, Chichester, 1996) and rice
(Zhu et al. (1994) Plant Cell Tiss. Org. Cult. 36:197-204).
3. Analysis of transformed plant host cells 

Once foreign DNA, in particular artificial chromosomes, is introduced
into plant hosts and the cells or protoplasts are grown and developed under
the conditions described herein, the plant cells or protoplasts that were
transformed with artificial chromosomes are identified. The plant cell,
protoplast, callus, leaf disc, or other plant target is screened for the
presence of artificial chromosomes by various methods well known in the art
including, but not limited to, assays for the expression of reporter genes,
PCR of the isolated plant chromosomes or DNA, electron microscopy,
visualization methods, and in situ hybridization with a chromosome painting
probe as described herein. Moreover, cells treated with artificial
chromosomes are isolated during metaphase using a mitotic arrest agent,
such as colchicine, and the artificial chromosomes are distinguished from
endogenous chromosomes by fluorescence-activated cell sorting, size and
density differences, or by any method well known in the art. Alternatively,
when a selectable marker gene is transmitted with or as part of the artificial
chromosome, selective agents are used to detect the expression of the
selectable marker (International PCT application publication no. WO
00/60061; US Patent No. 6,136,320; Owen and Pen, eds., Transgenic Plants:
A Production System for Industrial and Pharmaceutical Proteins). Enzymatic
assays, immunological assays, bioassays, germination assays, or chemical
assays are used to assess the phenotypic effects of artificial chromosomes,
such as insect or fungal resistance or any other expression of genes in
artificial chromosomes (Cheng et al. (1998) 95:2767-2772; US Patent No.
6,126,320; International PCT application publication no. WO 00/60061;
Owen and Pen, eds., Transgenic Plants: A Production System for Industrial
and Pharmaceutical Proteins, John Wiley & Sons, Chichester, 1996). The
plant cells, protoplasts, or other plant hosts that are successfully transformed
with artificial chromosomes are used directly to express the gene of interest
or are used to generate transgenic plants.

Fluorescent in situ hybridization (FISH) may be used to screen for the
transfer of artificial chromosomes into plant cells. DNA probes specific
for the artificial chromosome (e.g., a mouse major satellite DNA probe for
murine satellite DNA based artificial chromosomes; or a kanamycin,
hygromycin or GUS gene DNA probe for a plant artificial chromosome
carrying such a gene) are used with standard FISH techniques, which have
been described for plant cells (de Jong et al., Trends in Plant Science
4:258-263, 1999).

IdU labeling can be used to determine the optimum conditions for
transfer of chromosomes (in microcells) or of isolated artificial
chromosomes. The incorporated IdU increases the fragility of the chromosome
and will increase the probability of cellular mutation. Hence, the cells are
fixed within 48 hours after transfection/fusion and analyzed for chromosome
uptake using various procedures. Once the optimum transfer conditions have
been determined, long-term expression experiments are performed with
unlabeled artificial chromosomes or microcells.
H. Re-generation of transgenic plants 

Plants containing artificial chromosomes are generated from plant
cells, protoplasts, calli, or other plant tissue targets into which foreign DNA,
in particular artificial chromosomes, has been introduced. Regeneration
techniques for many commercially important plant species are well known in
the art. The artificial chromosomes that are inserted into plant hosts to
produce transgenic plants are PACs or MACs.

Plants are re-generated by the planting of transformed roots, plantlets,
seeds, seedlings and structures capable of growing into a whole plant
capable of reproduction (see, e.g., US Patent No. 6,136,320 and
International PCT application publication No. WO 00/60061). The
re-generation of maize plants from transformed protoplasts is found, for
example, in European Patent Application nos. 0 292 435 and 0 392 225 and
International PCT Application Publication no. WO 93/07278; the regeneration
of rice following gene transfer is found in Zhang et al. (1988) Plant Cell
Rep. 7:379-384; Shimamoto et al. (1989) Nature 338:274-277; Datta et al.
(1990) Bio/Technology 8:736-740; and the re-generation of fertile transgenic
barley by direct DNA transfer to protoplasts is described by Funatsuki et al.
(1995) Theor. Appl. Genet. 91:707-712. Alternatively, plants containing
artificial chromosomes are obtained by crossing a plant containing an
artificial chromosome with another plant to produce plants having an
artificial chromosome in their genomes (see, e.g., US Patent No. 6,150,585).

Plants containing an artificial chromosome are propagated through
seed, cuttings, or vegetatively. The seed from plants containing an artificial
chromosome is grown in the field, in pots, indoors, outdoors, in
greenhouses, on glass, or in or on any suitable medium, and the resulting
sexually mature transgenic plants are self-pollinated to generate true breeding
plants. The progeny from these transgenic plants become true breeding lines
(International PCT application publication No. WO 00/60061 and EP
1017268; US Patent Nos. 5,631,152; 5,955,362; 6,015,940; 6,013,523;
6,096,546; 6,037,527; 6,153,812; Weissbach and Weissbach (1988)
Methods for Plant Molecular Biology, Academic Press, Inc.; Fromm et al.
(1990) Bio/Technology 8:833-839; Gordon-Kamm et al. (1990) Plant Cell
2:603-618; Koziel et al. (1993) Bio/Technology 11:194-200; and Golovkin et
al. (1993) Plant Sci. 90:41-52).
1. PACs

Plant artificial chromosomes (PACs) are prepared by the in vivo and in
vitro methods described herein. PACs may be prepared inside plant
protoplasts and then transferred to plant targets, in particular to other plant
protoplasts, via fusion in the presence or absence of PEG as described herein
(Draper et al. (1982) Plant Cell Physiol. 23:451-458; Krens et al. (1982)
Nature 72-74). PACs are isolated from the protoplasts in which they were
prepared, encapsulated into liposomes, and delivered to other plant
protoplasts (Deshayes et al. (1985) EMBO J. 4:2731-2737). Alternatively,
the PACs are isolated and delivered directly to plant protoplasts, plant cells,
or other plant targets via a PEG-mediated process, calcium
phosphate-mediated process, electroporation, microinjection, sonoporation,
or any method known in the art as described herein (Hain et al. (1985) Mol.
Gen. Genet. 199:161-168; Fromm et al. (1986) Nature 319:791-793; Fromm et
al. (1985) Proc. Nat. Acad. Sci. USA 82:5824-5828; Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Nat. Acad. Sci. USA 85:8502-8505;
and International PCT application publication no. WO 91/00358).

2. MACs

Mammalian artificial chromosomes (MACs) are prepared by the in vivo
and in vitro methods described in US Patent Nos. 6,025,155 and 6,077,697,
and International PCT application No. WO 97/40183. MACs are prepared as
microcells, and the microcells are fused with plant protoplasts in the
presence or absence of PEG (Dudits et al. (1976) Hereditas 82:121-123;
Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149). Alternatively, the MACs
are isolated and delivered directly to plant cells, protoplasts, and other plant
targets via a PEG-mediated process, calcium phosphate-mediated process,
electroporation, microinjection, sonoporation, or any method known in the
art as described herein and in US Patent Nos. 6,025,155 and 6,077,697,
and International PCT application publication No. WO 97/40183.

After PACs or MACs are introduced into plant targets and the plant
targets are grown and analyzed for transfection, the transformed plant
targets are developed using standard conditions into roots, shoots, plantlets,
or any structure capable of growing into a plant. Transgenic plants can, in
turn, be generated by the planting of transformed roots, plantlets, seeds,
seedlings and structures capable of growing into a plant. Transgenic
plants can be propagated, for example, through seed, cuttings, or vegetative
propagation.

I. Applications and Uses of Artificial Chromosomes

Artificial chromosomes provide convenient and useful vectors, and in
some instances (e.g., in the case of very large heterologous genes) the only
vectors, for introduction of heterologous genes into hosts. Virtually any
gene of interest is amenable to introduction into a host via artificial
chromosomes.

As described herein, there are numerous methods for using artificial
chromosomes to introduce coding sequences into plant cells. These include
methods for using artificial chromosomes to express genes encoding
commercially valuable enzymes and therapeutic compounds in plant cells,
introduction of agronomically important traits, and applications related to
the manipulation of large regions of DNA.

The artificial chromosomes provided herein may be used in methods of
protein and gene product production, particularly using plant cells as host
cells for production of such products, and in cellular production systems in
which the artificial chromosomes provide a reliable, stable and efficient
means for optimizing the biomanufacturing of important compounds for
medicine and industry. They are also intended for use in methods of gene
therapy and for production of transgenic organisms, particularly plants
(discussed above, below and in the EXAMPLES).

1. Production of products in plants

Methods for expression of heterologous proteins in plant cells
("molecular farming") are provided. At present, many foreign proteins have
been expressed in whole plants or selected plant organs. Plants can offer a
highly effective and economical means to produce recombinant proteins as
they can be grown on a large scale at modest cost. The production of
heterologous proteins in plants has included the use of genes that are fused
to strong constitutive plant promoters, e.g., 35S from cauliflower mosaic
virus (Sijmons et al., 1990, Bio/Technology, 8:217-221, Benfey and Chua, US
5,110,732, Fraley et al., US 5,858,742, McPherson and Kay, US
5,359,142); seed specific promoters (Hall et al., US 5,504,200, Knauf et al.,
US 5,530,194, Thomas et al., US 5,905,186, Moloney, US 5,792,922, US
5,948,682); or promoters active in other plant organs such as fruit (Radke et
al., 1988, Theoret. Appl. Genet., 75:685-694, Bestwick et al., US
5,783,394, Houck and Pear, US 4,943,674) or storage organs such as
tubers (Rocha-Sosa et al., US 5,436,393, US 5,723,757). The genes under
the control of these promoters can encode any protein and include, for
example, genes that encode receptors, cytokines, enzymes, proteases,
hormones, growth factors, antibodies, tumor suppressors, vaccines,
therapeutic products and multigene pathways.

Industrial enzymes that can be produced include, for
example, α-amylase, glucanase, phytase and xylanase (see Goddijn and Pen
(1995) Trends Biotechnol. 13:379-387; Pen et al. (1992) Bio/Technology
10:292-296; Horvath et al. (2000) Proc. Natl. Acad. Sci. U.S.A.
97:1914-1919; and, e.g., Herbers and Sonnewald (1996) in Transgenic Plants:
A Production System for Industrial and Pharmaceutical Proteins, Owen and Pen,
Eds., John Wiley & Sons, West Sussex, England), proteases such as
subtilisin and other industrially important enzymes. Additional proteins that
can be produced in crops by molecular farming include other industrial
enzymes, for example, proteases, carbohydrate modifying enzymes such as
glucose oxidase, cellulases, hemicellulases, xylanases, mannanases or
pectinases (e.g., Baszczynski et al., US 5,824,870, US 5,767,379, Bruce et
al., US 5,804,694). Additionally, enzymes particularly valuable in the pulp
and paper industry, such as ligninases or xylanases, also can be produced
(Austin-Philips et al., US 5,981,835). Other examples of
enzymes include phosphatases, oxidoreductases and phytases (van Ooijen
et al., US 5,714,474).

Additionally, expression and delivery of vaccines in plants has been
proposed (Arntzen and Lam, US 6,136,320, US 5,914,123; Curtiss and
Cardineau, US 5,679,880, US 5,654,184; Lam and Arntzen, US 5,612,487,
US 6,034,298; Rymerson et al., WO 9937784 A1), as has expression of
antibodies (Conrad et al., WO 972900A1; Hein et al., US 5,959,177; Hiatt
and Hein, US 5,202,422, US 5,639,947; Hiatt et al., US 6,046,037),
peptide hormones (Vandekerckhove, J.S., US 5,487,991; Brandle et al.,
WO 9967401 A2), blood factors and similar therapeutic molecules.
Expression of vaccines in edible plants can provide a means for drug delivery
that is cost effective and particularly suited for the administration of
therapeutic agents in rural or underdeveloped countries. The plant material
containing the therapeutic agents could be cultivated and incorporated into
the diet (Lam, D.M., and Arntzen, C.J., US 5,484,719). Similarly, plants
used for animal feed can be engineered to express veterinary biologics that
can provide protection against animal disease (Rymerson et al.,
WO 9937784 A1). Antibodies also can be produced in plants, including, for
example, a gene fusion encoding an antigen-binding single chain Fv protein
(scFv) that recognizes the hapten oxazolone (Fiedler and Conrad (1995)
Bio/Technology 13:1090-1093) and IgG (Ma et al. (1995) Science
268:716-719). Monoclonal antibodies for therapeutic and diagnostic
applications are of particular interest.

Examples of human biopharmaceuticals that can be expressed in
plants include, but are not limited to, albumin (Sijmons et al. (1990)),
enkephalins (Vandekerckhove et al. (1989)), interferon-α (Zhu et al. (1994))
and GM-CSF (Ganz et al. (1996) in Transgenic Plants: A Production System
for Industrial and Pharmaceutical Proteins, Owen and Pen, Eds., John Wiley &
Sons, West Sussex, England, pp. 281-297; and Sardana et al. (1998) in
Methods in Biotechnology, Vol. 3: Recombinant Proteins from Plants:
Production and Isolation of Clinically Useful Compounds, Cunningham and
Porter, Eds., Humana Press, New Jersey; pp. 77-87).

Cells containing the artificial chromosomes provided herein can
advantageously be used in in vitro plant cell-based systems for production of
proteins, particularly several proteins from one cell line, such as multiple
proteins involved in a biochemical pathway or multivalent vaccines. The
genes encoding the proteins are introduced into the artificial chromosomes,
which are then introduced into plant cells. Plant cells useful for this purpose
are those that grow well in culture, or most preferably, plant cells capable of
being regenerated to whole plants. Plants can then be cultivated by common
methods to produce plant material comprising said heterologous proteins.
The heterologous proteins can be subject to purification or the plant tissue or
extracts thereof can be used directly for vaccination, amelioration of disease,
or processing of material, such as bleaching during pulp and paper
processing or enzymatic conversion of industrial materials or feedstocks.
Alternatively, the heterologous gene(s) of interest are transferred into a
production cell line or plant line that already contains artificial chromosomes
in a manner that targets the gene(s) to the artificial chromosomes. The cells
or plants are grown under conditions whereby the heterologous proteins are
expressed. Because the proteins are expressed at high levels in a stable
permanent extra-genomic chromosomal system, selective conditions are not
required.

Selection of host lines for use in artificial chromosome-based protein
production systems is within the skill of the art, but often will depend on a
variety of factors, including the properties of the heterologous protein to be
produced, potential toxicity of the protein in the host cell, any requirements
for post-translational modification (e.g., glycosylation, amination,
phosphorylation) of the protein, transcription factors available in the cells,
the type of promoter element(s) being used to drive expression of the
heterologous gene, whether production is completely intracellular or the
heterologous protein will preferably be secreted from the cell, or be
sequestered or localized, and the types of processing enzymes in the cell.

Artificial chromosomes can be engineered as platforms for the
production of specific molecules in plant cells. For example, production of
complex mammalian molecules, such as multichain antibodies, requires a
number of protein activities not normally found in plant species. It is
possible to produce an artificial chromosome that comprises all of the
mammalian activities needed to produce human antibodies, correctly modified
and processed, by introducing into an artificial chromosome the genes
needed to carry out these activities. Said genes would be modified, for
example, by placing each gene under the control of a plant promoter, or by
placing the master control gene, i.e., a gene that controls expression of the
various genes, under the control of a plant promoter. Alternatively,
mammalian transcriptional control factors could be introduced, under the
control of plant active promoters, to be expressed in a plant cell and cause
the expression of said target proteins, for example multichain antibodies.

In this fashion, plant artificial chromosomes are developed, each
capable of supporting the efficient production of a specific class of valuable
products, for example, antibodies, blood clotting factors, etc. Thus,
production of products within a class, for example, human antibodies, would
simply involve the introduction of a specific antibody coding sequence,
without modification, into the artificial chromosome engineered specifically
for the production of human antibodies. The artificial chromosome would
comprise all of the required genetic activities for the proper expression,
translation and post-translational modification of human antibodies. Such
artificial chromosomes can be used in a variety of applications, such as, but
not limited to, large scale production of numerous specific human
antibodies.

Advantages of plant cells as host cell lines in the production of
recombinant proteins include, but are not limited to, the following: (1)
proteins are post-translationally modified in a manner similar to that in
mammalian systems, (2) plants can be directed to secrete proteins into
stable, dry, intracellular compartments of seeds called endosperm protein
bodies, which can easily be collected, (3) the amount of recombinant product
that can be produced approaches industrial scale levels and (4) health risks
due to contamination with potential pathogens/toxins are minimized.

The artificial chromosome-based system for heterologous protein
production has many advantageous features. For example, as described
above, because the heterologous DNA is located in an independent,
extra-genomic artificial chromosome (as opposed to randomly inserted in an
unknown area of the host cell genome or located as extrachromosomal
element(s) providing only transient expression), it is stably maintained in an
active transcription unit and is not subject to ejection via recombination or
elimination during cell division. Accordingly, it is unnecessary to include a
selection gene in the host cells and thus growth under selective conditions is
also unnecessary. Furthermore, because the artificial chromosomes are
capable of incorporating large segments of DNA, multiple copies of the
heterologous gene and linked promoter element(s) can be retained in these
chromosomes, thereby providing for high-level expression of the foreign
protein(s). Alternatively, multiple copies of the gene can be linked to a single
promoter element and several different genes can be linked in a fused
polygene complex to a single promoter for expression of, for example, all the
key proteins constituting a complete metabolic pathway (see, e.g., Beck von
Bodman et al. (1995) Bio/Technology 13:587-591). Alternatively, multiple
copies of a single gene can be operatively linked to a single promoter, or
each copy, or one or several copies, can be linked to different promoters or
multiple copies of the same promoter. Additionally, because artificial
chromosomes have an almost unlimited capacity for integration and expression
of foreign genes, they can be used not only for the expression of genes
encoding end-products of interest, but also for the expression of genes
associated with optimal maintenance and metabolic management of the host
cell, e.g., genes encoding growth factors, as well as genes that facilitate
rapid synthesis of the correct form of the desired heterologous protein
product, e.g., genes encoding processing enzymes and transcription factors
as described above.

The artificial chromosomes are suitable for expression of any proteins
or peptides, including proteins and peptides that require in vivo
posttranslational modification for their biological activity. Such proteins
include, but are not limited to, antibody fragments, full-length antibodies, and
multimeric antibodies, tumor suppressor proteins, naturally occurring or
artificial antibodies and enzymes, heat shock proteins, and others.

Thus, such cell-based "protein factories" employing artificial
chromosomes can be generated using artificial chromosomes constructed
with multiple copies (theoretically an unlimited number or at least up to a
number such that the resulting artificial chromosome is about up to the size
of a genomic chromosome (i.e., endogenous)) of protein-encoding genes with
appropriate promoters, or multiple genes driven by a single promoter, i.e., a
fused gene complex (such as a complete metabolic pathway in a plant
expression system; see, e.g., Beck von Bodman et al. (1995) Biotechnology
13:587-591). Once such an artificial chromosome is constructed, it can be
transferred to a suitable plant species capable of being propagated under
field conditions, or under conditions that permit the recovery of the intended
product. Plant cell cultures such as algae can be used in a system analogous
to mammalian cell culture systems. The advantages of plant-based systems
such as this include low input costs for growth, rapid growth rates and
the ability to produce a large biomass economically.

The ability of artificial chromosomes to provide for high-level
expression of heterologous proteins in host cells is demonstrated, for
example, by analysis of mammalian cells containing a mammalian artificial
chromosome, H1D3 and G3D5 cell lines described herein. Northern blot
analysis of mRNA obtained from these cells reveals that expression of the
hygromycin-resistance and β-galactosidase genes in the cells correlates with
the amplicon number of the megachromosome(s) contained therein.

Transgenic plants producing these compounds are made by the
introduction and expression of one or potentially many genes using the
artificial chromosomes provided herein. The vast array of possibilities
includes, but is not limited to, any biological compound which is presently
produced by any organism such as proteins, nucleic acids, primary and
intermediary metabolites, carbohydrate polymers, enzymes for uses in
bioremediation, enzymes for modifying pathways that produce secondary
plant metabolites such as flavonoids or vitamins, enzymes that could produce
pharmaceuticals, and enzymes that could produce compounds
of interest to the manufacturing industry such as specialty chemicals and
plastics. The compounds are produced by the plant, extracted upon harvest
and/or processing, and used for any presently recognized useful purpose
such as pharmaceuticals, fragrances, and industrial enzymes. Alternatively,
plants produced in accordance with the methods and compositions provided
herein can be made to metabolize certain compounds, such as hazardous
wastes, thereby allowing bioremediation of these compounds.

The artificial chromosomes provided herein can be used in methods of
protein and gene product production, particularly using plant cells as host
cells for production of such products, and in cellular production systems in
which the artificial chromosomes provide a reliable, stable and efficient
means for optimizing the biomanufacturing of important compounds for
medicine and industry.

2. Genetic alteration of organisms to possess desired traits 
Artificial chromosomes are ideally suited for preparing organisms, such
as plants, that possess certain desired traits, such as, for example, disease
resistance, resistance to harsh environmental conditions, altered growth
patterns and enhanced physical characteristics. With respect to plants, the
choice of the particular nucleic acid that will be delivered to recipient cells via
artificial chromosomes often will depend on the purpose of the
transformation. One of the major purposes of transformation of crop and
tree species is to add some commercially desirable, agronomically important
traits to the plant. Such traits include, but are not limited to, input and
output traits such as herbicide resistance or tolerance, insect resistance or
tolerance, disease resistance or tolerance (viral, bacterial, fungal or
nematode), stress tolerance and/or resistance, as exemplified by resistance
or tolerance to drought, heat, chilling, freezing, excessive moisture, salt
stress and oxidative stress, increased yields, food content and makeup,
physical appearance, male sterility, drydown, standability, prolificacy, starch
quantity and quality, oil quantity and quality, protein quantity and quality and
amino acid composition. It may be desirable to incorporate one or more
genes conferring such desirable traits into host plants.

a. Herbicide resistance

The genes encoding phosphinothricin acetyltransferase (bar and pat),
glyphosate tolerant EPSP synthase genes, the glyphosate degradative
enzyme gene gox encoding glyphosate oxidoreductase, deh (encoding a
dehalogenase enzyme that inactivates dalapon), herbicide resistant
(e.g., sulfonylurea and imidazolinone) acetolactate synthase, and bxn genes
(encoding a nitrilase enzyme that degrades bromoxynil) are all examples of
herbicide resistance genes for use in plant transformation. The bar and pat
genes code for an enzyme, phosphinothricin acetyltransferase (PAT), which
inactivates the herbicide phosphinothricin and prevents this compound from
inhibiting glutamine synthetase enzymes. The enzyme
5-enolpyruvylshikimate 3-phosphate synthase (EPSP synthase) is normally
inhibited by the herbicide N-(phosphonomethyl)glycine (glyphosate).
However, genes are known that encode glyphosate-resistant EPSP synthase
enzymes. The deh gene encodes the enzyme dalapon dehalogenase and
confers resistance to the herbicide dalapon. The bxn gene codes for a
specific nitrilase enzyme that converts bromoxynil to a non-herbicidal
degradation product.

b. Insect and other pest resistance 

Insect-resistant organisms may be prepared in which resistance or
decreased susceptibility to insect-induced disease is conferred by
introduction into the host organism or embryo of artificial chromosomes
containing DNA encoding gene products (e.g., ribozymes and proteins that
are toxic to certain pathogens) that destroy or attenuate pathogens or limit
access of pathogens to the host. Potential insect resistance genes that can
be introduced into plants via artificial chromosomes include Bacillus
thuringiensis crystal toxin genes or Bt genes (see, e.g., Watrud et al. (1985)
in Engineered Organisms and the Environment). Bt genes may provide
resistance to lepidopteran or coleopteran pests such as the European Corn
Borer (ECB). Such Bt toxin genes include the CryIA(b) and CryIA(c) genes.
Endotoxin genes from other species of B. thuringiensis which affect insect
growth or development also may be employed in this regard. Bt gene
sequences can be modified to effect increased expression in plants, and
particularly monocot plants. Means for preparing synthetic genes are well
known in the art and are disclosed in, for example, U.S. Patent Nos.
5,500,365 and 5,689,052. Examples of such modified Bt toxin genes
include a synthetic Bt CryIA(b) gene (see, e.g., Perlak et al. (1991) Proc.
Natl. Acad. Sci. U.S.A. 88:3324-3328) and the synthetic CryIA(c) gene
termed 1800b (see PCT Application publication no. WO95/06128).

Examples of the types of genes that may be transferred into plants via
artificial chromosomes to generate disease- and/or insect-resistant transgenic
plants include, but are not limited to, the cryIA(b) and cryIA(c) genes which
yield products that are highly toxic to two major rice insect pests (the striped
stem borer and the yellow stem borer) (see, e.g., Cheng et al. (1998) Proc.
Natl. Acad. Sci. U.S.A. 95:2767-2772), cry3 genes which encode products
that are toxic to Coleopteran insects that attack a variety of plants, including
grains and legumes (see, e.g., U.S. Patent No. 6,023,013), genes (e.g., DNA
encoding trichothecene 3-O-acetyltransferase) that confer resistance to
trichothecenes such as those produced by plant fungi (e.g., Fusarium) in
plants particularly susceptible to fungi (e.g., wheat, rye, barley, oats, and
maize) (see, e.g., PCT Application publication no. WO 00/60061), and genes
involved in multi-gene biosynthetic pathways that yield antipathogenic
substances that have a deleterious effect on the growth of plant pathogens
(see, e.g., U.S. Patent No. 5,639,949).

Protease inhibitors may also provide insect resistance (see, e.g.,
Johnson et al. (1989)) and will thus have utility in plant transformation. The
use of a protease inhibitor II gene, pinII, from tomato or potato may be
particularly useful. The combined effect of the use of a pinII gene with a Bt
toxin gene can produce synergistic insecticidal activity. Other genes that
encode inhibitors of the insect's digestive system, or those that encode
enzymes or co-factors that facilitate the production of inhibitors, also may be
useful. This group may be exemplified by oryzacystatin and amylase
inhibitors such as those from wheat and barley.

Genes encoding lectins may confer additional or alternative insecticidal
properties. Lectins (originally termed phytohemagglutinins) are multivalent
carbohydrate-binding proteins which have the ability to agglutinate red blood
cells from a range of species. Lectins have been identified as insecticidal
agents with activity against weevils, ECB and rootworm (see, e.g., Murdock
et al. (1990) Phytochemistry 23:85-89; Czapla & Lang (1990) J. Econ.
Entomol. 53:2480-2485). Lectin genes that may be useful include, for
example, barley and wheat germ agglutinin (WGA) and rice lectins
(Gatehouse et al. (1984) J. Sci. Food Agric. 35:373-380).

Genes controlling the production of large and small polypeptides active
against insects when introduced into the insect pests, such as, for example,
lytic peptides, peptide hormones and toxins and venoms, may also be useful
in generating pest-resistant plants. For example, expression of juvenile
hormone esterase, directed toward specific insect pests, also may result in
insecticidal activity, or cause cessation of metamorphosis (see, e.g.,
Hammock et al. (1990) Nature 344:458-461).

Genes which encode enzymes that affect the integrity of the insect
cuticle are additional examples of genes that may
be transferred to plants via artificial chromosomes to confer resistance to
insects. Such genes include those encoding, for example, chitinase,
proteases, lipases and also genes for the production of nikkomycin, a
compound that inhibits chitin synthesis, the introduction of any of which
may be used to produce insect-resistant plants. Genes that affect insect
molting, such as those affecting the production of ecdysteroid UDP-glucosyl
transferase, also can be useful transgenes.

Genes that code for enzymes that facilitate the production of
compounds that reduce the nutritional quality of the host plant to insect
pests may also be used to confer insect resistance on plants. It may be
possible, for instance, to confer insecticidal activity on a plant by altering its
sterol composition. Sterols are obtained by insects from their diet and are
used for hormone synthesis and membrane stability. Therefore, alterations in
plant sterol composition by expression of genes that directly promote the
production of undesirable sterols or those that convert desirable sterols into
undesirable forms, could have a negative effect on insect growth and/or
development and hence endow the plant with insecticidal activity.
Lipoxygenases are naturally occurring plant enzymes that have been shown
to exhibit anti-nutritional effects on insects and to reduce the nutritional
quality of their diet. Therefore, transgenic plants with enhanced
lipoxygenase activity may be resistant to insect feeding.

Tripsacum dactyloides is a species of grass that is resistant to certain
insects, including corn rootworm. Tripsacum may thus include genes
encoding proteins that are toxic to insects or are involved in the biosynthesis
of compounds toxic to insects. Such genes may be useful in conferring
resistance to insects. It is known that the basis of insect resistance in
Tripsacum is genetic, because said resistance has been transferred to Zea
mays via sexual crosses (Branson and Guss, 1972). It is further anticipated
that other cereal, monocot or dicot plant species may have genes encoding
proteins that are toxic to insects which would be useful for producing insect
resistant plants.

Further genes encoding proteins characterized as having potential
insecticidal activity also may be used as transgenes in accordance herewith.
Such genes include, for example, the cowpea trypsin inhibitor (CpTI; Hilder
et al., 1987) which may be used as a rootworm deterrent, genes encoding
avermectin (Avermectin and Abamectin, Campbell, W.C., Ed., 1989; Ikeda
et al., 1987) which may prove particularly useful as a corn rootworm
deterrent, ribosome inactivating protein genes and even genes that regulate
plant structures. Transgenic plants including anti-insect antibody genes and
genes that code for enzymes that can convert a non-toxic insecticide
(pro-insecticide) applied to the outside of the plant into an insecticide inside
the plant also are contemplated.

c. Disease resistance 
Transgenic organisms, such as plants, that express genes that confer
resistance or reduce susceptibility to disease are of particular interest. For
example, the transgene may encode a protein that is toxic to a pathogen,
such as a virus, fungus, mycotoxin-producing organism, nematode or
bacterium, but that is not toxic to the transgenic host.

Because multiple genes can be introduced on an artificial
chromosome, a series of genes encoding a genetic pathway involved in
disease resistance or tolerance can be introduced into crop plants. For
example, it is known that often numerous genes are expressed upon
pathogen invasion; typically one or more "PR", or pathogen related, proteins
are expressed in response to invasion by a plant bacterial or fungal pathogen.
One or more of the proteins involved in conferring resistance to pathogens
can be contained within an artificial chromosome and therefore be expressed
in a plant cell, in particular a whole transgenic plant as described herein. In
addition, production of single-chain Fv recombinant antibodies in plants may
extend the range of possibilities for the introduction of pathogen protection
in crop plants (see, e.g., Tavladoraki et al. (1993) Nature 356:469-472).

It has been demonstrated that expression of a viral coat protein in a
transgenic plant can impart resistance to infection of the plant by that virus
and perhaps other closely related viruses (Cuozzo et al., 1988; Hemenway et
al., 1988; Abel et al., 1986). Expression of antisense genes targeted at
essential viral functions may also impart resistance to viruses. For example,
an antisense gene targeted at the gene responsible for replication of viral
nucleic acid may inhibit replication and lead to resistance to the virus.
Interference with other viral functions through the use of antisense genes
also may increase resistance to viruses. Further, it may be possible to
achieve resistance to viruses through other approaches, including, but not
limited to, the use of satellite viruses. Artificial chromosomes are ideally
suited for carrying a multiplicity of these genes and DNA sequences which
are useful for conferring a broad range of resistance to many pathogens.

Genes encoding so-called "peptide antibiotics," pathogenesis related
(PR) proteins, toxin resistance, and proteins affecting host-pathogen
interactions such as morphological characteristics may also be useful,
particularly in conferring increased resistance to diseases caused by bacteria
and fungi. Peptide antibiotics are polypeptide sequences which are inhibitory
to growth of bacteria and other microorganisms. For example, the classes of
peptides referred to as cecropins and magainins inhibit growth of many
species of bacteria and fungi. Expression of PR proteins in monocotyledonous
plants such as maize may be useful in conferring resistance to bacterial
disease. These genes are induced following pathogen attack on a host plant
and have been divided into at least five classes of proteins (Bol, Linthorst,
and Cornelissen, 1990). Included among the PR proteins are β-1,3-glucanases,
chitinases, and osmotin and other proteins that are believed to function in
plant resistance to disease organisms. Other genes have been identified that
have antifungal properties, e.g., UDA (stinging nettle lectin) and hevein
(Broekaert et al., 1989; Barkai-Golan et al., 1978). It is known that certain
plant diseases are caused by the production of phytotoxins. Resistance to
these diseases may be achieved through expression of a gene that encodes
an enzyme capable of degrading or otherwise inactivating the phytotoxin. It
also is contemplated that expression of genes that alter the interactions
between the host plant and pathogen may be useful in reducing the ability of
the disease organism to invade the tissues of the host plant, e.g., an
increase in the waxiness of the leaf cuticle or other morphological
characteristics.

d. Environment or stress resistance 
Improvement of a plant's ability to tolerate various environmental
stresses such as, but not limited to, drought, excess moisture, chilling,
freezing, high temperature, salt, and oxidative stress, also can be effected
through expression of genes therein. It is proposed that benefits may be
realized in terms of increased resistance to freezing temperatures through the
introduction of an "antifreeze" protein such as that of the Winter Flounder
(Cutler et al., 1989) or synthetic gene derivatives thereof. Improved chilling
tolerance also may be conferred through increased expression of
glycerol-3-phosphate acetyltransferase in chloroplasts (Wolter et al., 1992).
Resistance to oxidative stress in some crop species (often exacerbated by
conditions such as chilling temperatures in combination with high light
intensities) can be conferred by expression of superoxide dismutase (Gupta
et al., 1993), and may be improved by glutathione reductase (Bowler et al.,
1992). Such strategies may allow for tolerance to freezing in newly emerged
fields as well as extending later-maturity, higher-yielding varieties to earlier
relative maturity zones.

It is contemplated that the expression of genes that favorably affect
plant water content, total water potential, osmotic potential, and turgor will
enhance the ability of the plant to tolerate drought. As used herein, the
terms "drought resistance" and "drought tolerance" are used to refer to a
plant's increased resistance or tolerance to stress induced by a reduction in
water availability, as compared to normal circumstances, and the ability of
the plant to function and survive in lower-water environments. The
expression of genes encoding for the biosynthesis of osmotically-active
solutes, such as polyol compounds, may impart protection against drought.
Within this class are genes encoding for mannitol-1-phosphate
dehydrogenase (Lee and Saier, 1982) and trehalose-6-phosphate synthase
(Kaasen et al., 1992). Through the subsequent action of native
phosphatases in the cell or by the introduction and coexpression of a specific
phosphatase, these introduced genes will result in the accumulation of either
mannitol or trehalose, respectively, both of which have been well
documented as protective compounds able to mitigate the effects of stress.
Mannitol accumulation in transgenic tobacco has been verified and
preliminary results indicate that plants expressing high levels of this
metabolite are able to tolerate an applied osmotic stress (Tarczynski et al.,
1992, 1993).

Similarly, the efficacy of other metabolites in protecting either enzyme
function (e.g., alanopine or propionic acid) or membrane integrity (e.g.,
alanopine) has been documented (Loomis et al., 1989), and therefore
expression of genes encoding for the biosynthesis of these compounds might
confer drought resistance in a manner similar to or complementary to
mannitol. Other examples of naturally occurring metabolites that are
osmotically active and/or provide some direct protective effect during
drought and/or desiccation include fructose, erythritol (Coxson et al., 1992),
sorbitol, dulcitol (Karsten et al., 1992), glucosylglycerol (Reed et al., 1984;
Erdmann et al., 1992), sucrose, stachyose (Koster and Leopold, 1988;
Blackman et al., 1992), raffinose (Bernal-Lugo and Leopold, 1992), proline
(Rensburg et al., 1993), glycine betaine, ononitol and pinitol (Vernon and
Bohnert, 1992). Continued canopy growth and increased reproductive
fitness during times of stress will be augmented by introduction and
expression of genes such as those controlling the osmotically active
compounds discussed above and other such compounds. Genes which
promote the synthesis of an osmotically active polyol compound include
genes which encode the enzymes mannitol-1-phosphate dehydrogenase,
trehalose-6-phosphate synthase and myoinositol O-methyltransferase.
Artificial chromosomes can carry a multiplicity of genes to provide durable
stress tolerance, for example, concomitant expression of proline and betaine
and/or polyols.

It is contemplated that the expression of specific proteins also may
increase drought tolerance under certain conditions or in certain crop
species. These may include proteins such as late embryogenesis abundant
(LEA) proteins (see Dure et al., 1989). All three classes of LEAs have been
demonstrated in maturing (i.e., desiccating) seeds. Within LEA proteins, the
Type-II (dehydrin-type) have generally been implicated in drought and/or
desiccation tolerance in vegetative plant parts (i.e., Mundy and Chua, 1988;
Piatkowski et al., 1990; Yamaguchi-Shinozaki et al., 1992). Recently,
expression of a Type-III LEA (HVA-1) in tobacco was found to influence plant
height, maturity and drought tolerance (Fitzpatrick, 1993). In rice,
expression of the HVA-1 gene influenced tolerance to water deficit and
salinity (Xu et al., 1996). Expression of structural genes from all three LEA
groups may therefore confer drought tolerance. Other types of proteins
induced during water stress include thiol proteases, aldolases and
transmembrane transporters (Guerrero et al., 1999), which may confer
various protective and/or repair-type functions during drought stress. It also
is contemplated that genes that affect lipid biosynthesis and hence membrane
composition might also be useful in conferring drought resistance on the
plant.

Many of these genes for improving drought resistance have
complementary modes of action. Thus, combinations of these genes might
have additive and/or synergistic effects in improving drought resistance in
plants. Many of these genes also improve freezing tolerance (or resistance);
the physical stresses incurred during freezing and drought are similar in
nature and may be mitigated in similar fashion. Benefit may be conferred via
constitutive expression of these genes, but the preferred means of
expressing these genes may be through the use of a turgor-induced promoter
(such as the promoters for the turgor-induced genes described in Guerrero et
al., 1990 and Shagan et al., 1993, which are incorporated herein by
reference). Spatial and temporal expression patterns of these genes may
enable plants to better withstand stress.

It is proposed that expression of genes that are involved with specific
morphological traits that allow for increased water extraction from drying
soil would be of benefit. For example, introduction and expression of genes
that alter root characteristics may enhance water uptake. It also is
contemplated that expression of genes that enhance reproductive fitness
during times of stress would be of significant value. For example, expression
of genes that improve the synchrony of pollen shed and receptiveness of the
female flower parts, i.e., silks, would be of benefit. In addition it is
proposed that expression of genes that minimize kernel abortion during times
of stress would increase the amount of grain to be harvested and hence be
of value.

Given the overall role of water in determining yield, it is contemplated
that enabling plants to utilize water more efficiently, through the introduction
and expression of genes, will improve overall performance even when soil
water availability is not limiting. By introducing genes that improve the
ability of plants to maximize water usage across a full range of stresses
relating to water availability, yield stability or consistency of yield
performance may be realized.

e. Plant agronomic characteristics 
Plants possessing desired traits that might, for example, enhance
utility, processibility and commercial value of the organisms in areas such as
the agricultural and ornamental plant industries may also be generated using
artificial chromosomes in the same manner as described above for production
of disease-resistant organisms. In such instances, the artificial chromosomes
that are introduced into the organism or embryo contain DNA encoding gene
products that serve to confer the desired trait in the organism.

For example, transgenic plants having improved flavor properties,
stability and/or quality are of commercial interest. One possible method for
generating such plants may include the expression of transgenes, e.g., genes
encoding cystathionine gamma synthase (CGS), that result in increased free
methionine levels (see, e.g., PCT Application publication no. WO 00/55303).

Two of the factors determining where crop plants can be grown are
the average daily temperature during the growing season and the length of
time between frosts. Within the areas where it is possible to grow a
particular crop, there are varying limitations on the maximal time it is allowed
to grow to maturity and be harvested. For example, a variety to be grown in
a particular area is selected for its ability to mature and dry down to
harvestable moisture content within the required period of time with
maximum possible yield. Therefore, crops of varying maturities are
developed for different growing locations. Apart from the need to dry down
sufficiently to permit harvest, it is desirable to have maximal drying take
place in the field to minimize the amount of energy required for additional
drying post-harvest. Also, the more readily a product such as grain can dry
down, the more time there is available for growth and kernel fill. Genes that
influence maturity and/or dry down can be identified and introduced into
plant lines using transformation techniques to create new varieties adapted
to different growing locations or the same growing location, but having
improved yield to moisture ratio at harvest. Expression of genes that are
involved in regulation of plant development may be especially useful.

Genes that would improve standability and other plant growth
characteristics may also be introduced into plants. Expression of new genes
in plants which confer stronger stalks, improved root systems, or prevent or
reduce ear droppage would be of great value to the farmer. Introduction and
expression of genes that increase the total amount of photoassimilate
available by, for example, increasing light distribution and/or interception
would be advantageous. In addition, the expression of genes that increase
the efficiency of photosynthesis and/or the leaf canopy would further
increase gains in productivity. Expression of a phytochrome gene in crop
plants may be advantageous. Expression of such a gene may reduce
apical dominance, confer semidwarfism on a plant, and increase shade
tolerance (U.S. Patent No. 5,268,526). Such approaches would allow for
increased plant populations in the field.

f. Nutrient utilization

The ability to utilize available nutrients may be a limiting factor in
growth of crop plants. It may be possible to alter nutrient uptake, tolerance
of pH extremes, mobilization through the plant, storage pools, and availability
for metabolic activities by the introduction of new agents. These
modifications would allow a plant such as maize to more efficiently utilize
available nutrients. An increase in the activity of, for example, an enzyme
that is normally present in the plant and involved in nutrient utilization may
increase the availability of a nutrient. An example of such an enzyme would
be phytase. It is further contemplated that enhanced nitrogen utilization by a
plant is desirable. Expression of a glutamate dehydrogenase gene in plants,
e.g., E. coli gdhA genes, may lead to enhanced resistance to the herbicide
glufosinate by incorporation of excess ammonia into glutamate, thereby
detoxifying the ammonia. Gene expression may make a nutrient source
available that was previously not accessible, e.g., an enzyme that releases a
component of nutrient value from a more complex molecule, perhaps a
macromolecule. Alternatively, artificial chromosomes can carry the
multiplicity of genes governing nodulation and nitrogen fixation in legumes.
The artificial chromosomes could be used to promote nodulation in
non-legume species.

g. Male sterility 

Male sterility is useful in the production of hybrid seed. Male sterility
may be produced through gene expression. For example, it has been shown
that expression of genes that encode proteins that interfere with
development of the male inflorescence and/or gametophyte results in male
sterility. Chimeric ribonuclease genes that express in the anthers of
transgenic tobacco and oilseed rape have been demonstrated to lead to male
sterility (Mariani et al., 1990). Other methods of conferring male sterility
have been described, including genes encoding antisense RNA capable of
causing male sterility (U.S. Patent Nos. 6,184,439, 6,191,343 and
5,728,926) and methods utilizing two genes to confer sterility (see, e.g.,
U.S. Patent No. 5,426,041).
5 A number of mutations were discovered in maize that confer 

cytoplasmic male sterility. Onie mutation in particular, referred to as T 
cytoplasm, also correlates with sensitivity to Southern corn leaf blight. A 
DNA sequence, designated TURF-13 (Levings, 1990), was identified that 
correlates with T cytoplasm. It is proposed that it would be possible, through
the introduction of TURF-13 via transformation, to separate male sterility
from disease sensitivity. As it is necessary to be able to restore male fertility
for breeding purposes and for grain production, it is proposed that genes
encoding restoration of male fertility also may be introduced.

h. Improved nutritional content

Genes may be introduced into plants to improve the nutrient quality or
content of a particular crop. Introduction of genes that alter the nutrient
composition of a crop may greatly enhance the feed or food value. For
example, the protein of many grains is suboptimal for feed and food purposes
especially when fed to pigs, poultry, and humans. The protein is deficient in
several amino acids that are essential in the diet of these species, requiring
the addition of supplements to the grain. Limiting essential amino acids may
include lysine, methionine, tryptophan, threonine, valine, arginine, and
histidine. Some amino acids become limiting only after corn is supplemented
with other inputs for feed formulations. The levels of these essential amino
acids in seeds and grain may be elevated by mechanisms which include, but
are not limited to, the introduction of genes to increase the biosynthesis of
the amino acids, increase the storage of the amino acids in proteins, or
increase transport of the amino acids to the seeds or grain.
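
By way of illustration only, the proportion of the limiting essential amino acids named above in a given protein can be tallied from its sequence. The following Python sketch assumes one-letter amino acid codes; the function name `limiting_fraction` and the toy sequence are hypothetical and do not correspond to any protein described herein.

```python
from collections import Counter

# One-letter codes for the limiting essential amino acids named in the text.
LIMITING = {
    "K": "lysine",
    "M": "methionine",
    "W": "tryptophan",
    "T": "threonine",
    "V": "valine",
    "R": "arginine",
    "H": "histidine",
}


def limiting_fraction(protein: str) -> dict:
    """Return the fraction of residues contributed by each listed amino acid."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {name: counts[code] / len(seq) for code, name in LIMITING.items()}


# Hypothetical toy sequence (15 residues), not a real storage protein.
toy = "MAKLLQQPPLLQAAT"
profile = limiting_fraction(toy)
```

A profile of this kind could be compared before and after a proposed change in protein composition; it says nothing about digestibility or about free amino acid pools.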

The protein composition of a crop may be altered to improve the
balance of amino acids in a variety of ways including elevating expression of
native proteins, decreasing expression of those with poor composition,
changing the composition of native proteins, or introducing genes encoding
entirely new proteins possessing superior composition.

The introduction of genes that alter the oil content of a crop plant may
also be of value. Increases in oil content may result in increases in
metabolizable energy content and density of seeds for use in feed and food.
The introduced genes may encode enzymes that remove or reduce rate-
limitations or regulated steps in fatty acid or lipid biosynthesis. Such genes
may include, but are not limited to, those that encode acetyl-CoA
carboxylase, ACP-acyltransferase, β-ketoacyl-ACP synthase, plus other well
known fatty acid biosynthetic activities. Other possibilities are genes that
encode proteins that do not possess enzymatic activity such as acyl-carrier
proteins. Genes may be introduced that alter the balance of fatty acids
present in the oil providing a more healthful or nutritive feedstuff. The
introduced DNA also may encode sequences that block expression of
enzymes involved in fatty acid biosynthesis, altering the proportions of fatty
acids present in crops.

Genes may be introduced that enhance the nutritive value of the 
starch component of crops, for example by increasing, or in some cases
decreasing, the degree of branching, resulting in improved utilization of the
starch in livestock by delaying its metabolism. Additionally, other major 
constituents of a crop may be altered, including genes that affect a variety of 
other nutritive, processing, or other quality aspects. For example, 
pigmentation may be increased or decreased. 

Feed or food crops may also possess insufficient quantities of
vitamins, requiring supplementation to provide adequate nutritive value.
Introduction of genes that enhance vitamin biosynthesis may be envisioned
including, for example, vitamins A (e.g. rice with Vitamin A or golden rice),
E, B12, choline, and the like. Mineral content may also be sub-optimal. Thus
genes that affect the accumulation or availability of compounds containing
phosphorus, sulfur, calcium, manganese, zinc, and iron among others would
be valuable.

Numerous other examples of improvements of crops may be effected 
using the artificial chromosomes, with appropriate heterologous genes
contained therein, in accordance with the methods and compositions
provided herein. The improvements may not necessarily involve grain, but
may, for example, improve the value of a crop for silage. Introduction of
DNA to accomplish this might include sequences that alter lignin production
such as those that result in the "brown midrib" phenotype associated with
superior feed value for cattle.

In addition to direct improvements in feed or food value, genes also 
may be introduced which improve the processing of crops and improve the 
value of the products resulting from the processing. One use of crops is via 
wetmilling. Thus, genes that increase the efficiency and reduce the cost of
such processing, for example, by decreasing steeping time may also find use.
Improving the value of wetmilling products may include altering the quantity
or quality of starch, oil, corn gluten meal, or the components of gluten feed.
Elevation of starch may be achieved through the identification and
elimination of rate limiting steps in starch biosynthesis or by decreasing
levels of the other components of crops resulting in proportional increases in
starch. 

Oil is another product of wetmilling, the value of which may be 
improved by introduction and expression of genes. Oil properties may be
altered to improve its performance in the production and use of cooking oil,
shortenings, lubricants or other oil-derived products or to improve its
health attributes when used in food-related applications. Fatty acids also
may be synthesized which upon extraction can serve as starting materials for
chemical syntheses. The changes in oil properties may be achieved by
altering the type, level, or lipid arrangement of the fatty acids present in the
oil. This in turn may be accomplished by the addition of genes that encode
enzymes that catalyze the synthesis of new fatty acids and the lipids
possessing them or by increasing levels of native fatty acids while possibly
reducing levels of precursors. Alternatively, DNA sequences may be
introduced which slow or block steps in fatty acid biosynthesis resulting in
the increase in precursor fatty acid intermediates. Genes that might be
added include desaturases, epoxidases, hydratases, dehydratases and other
enzymes that catalyze reactions involving fatty acid intermediates.
Representative examples of catalytic steps that might be blocked include the
desaturations from stearic to oleic acid and oleic to linolenic acid resulting in
the respective accumulations of stearic and oleic acids. Another example is
the blockage of elongation steps resulting in the accumulation of C8 to C12 
saturated fatty acids. 

i. Production of chemicals or biologicals

Transgenic plants can be used as protein production systems to
generate recombinant products ranging from industrial enzymes, viral
antigens, vaccines, antibodies, human blood proteins, cytokines, growth
factors, enkephalins, serum albumin and other proteins of clinical relevance
and pharmaceuticals. For example, enzymes including α-amylase, glucanase,
phytase and xylanase can be produced (see, Goddijn and Pen (1995) Trends
Biotechnol. 13:379-387; Pen et al. (1992) Bio/Technology 10:292-296;
Horvath et al. (2000) Proc. Natl. Acad. Sci. U.S.A. 97:1914-1919; and e.g.,
Herbers and Sonnewald (1996) in "Transgenic Plants: A Production System for
Industrial and Pharmaceutical Proteins", Owen and Pen Eds., John Wiley &
Sons, West Sussex, England).

Examples of medically relevant proteins that may be produced in
plants include surface antigens of viral pathogens, such as hepatitis B virus
and transmissible gastroenteritis virus spike protein, for use in vaccines. The
proteins thus produced may be isolated and administered through standard
vaccine introduction methods or through the consumption of the edible
transgenic plant as food which can be taken orally (see, e.g., U.S. Patent No.
6,136,320 and Mason et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:11745-
11749). HIV, rhinovirus, malarial and rabies virus antigens are additional
examples of antigens that may be expressed in plants as candidate vaccines
(see, e.g., Porta et al. (1994) Virol. 202:949-955; Turpen et al. (1995)
Bio/Technology 13:53-57; and McGarvey et al. (1995) Bio/Technology
13:1484-1487). Antibodies may also be produced in plants, including, for
example, a gene fusion encoding an antigen-binding single chain Fv protein
(scFv) that recognizes the hapten oxazolone (Fiedler and Conrad (1995)
Bio/Technology 13:1090-1093) and IgG (Ma et al. (1995) Science 268:716-
719).

Examples of human biopharmaceuticals that may be expressed in
plants include, but are not limited to, albumin (Sijmons et al. (1990)),
enkephalins (Vandekerckhove et al. (1989)), interferon-α (Zhu et al. (1994))
and GM-CSF (Ganz et al. (1996) in Transgenic Plants: A Production System
for Industrial and Pharmaceutical Proteins, Owen and Pen Eds., John Wiley &
Sons, West Sussex, England, pp. 281-297; and Sardana et al. (1998) in
Methods in Biotechnology, Vol. 3: Recombinant Proteins from Plants:
Production and Isolation of Clinically Useful Compounds, Cunningham and
Porter, Eds., Humana Press, New Jersey; pp. 77-87).

Transgenic plants producing these compounds are made possible by
the introduction and expression of one or potentially many genes using the
artificial chromosomes provided herein. The vast array of possibilities
includes, but is not limited to, any biological compound which is presently
produced by any organism such as proteins, nucleic acids, primary and
intermediary metabolites, carbohydrate polymers, enzymes for uses in
bioremediation, enzymes for modifying pathways that produce secondary
plant metabolites such as flavonoids or vitamins, enzymes that could produce
pharmaceuticals, and enzymes that could produce compounds
of interest to the manufacturing industry such as specialty chemicals and
plastics. The compounds may be produced by the plant, extracted upon
harvest and/or processing, and used for any presently recognized useful
purpose such as pharmaceuticals, fragrances, and industrial enzymes to
name a few. Alternatively, plants produced in accordance with the methods
and compositions provided herein may be made to metabolize certain
compounds, such as hazardous wastes, thereby allowing bioremediation of
these compounds.

j. Non-protein-expressing sequences

Nucleic acids may be introduced into plants that are designed to
down-regulate or suppress a plant-encoded gene. A number of different means
to achieve down-regulation have been demonstrated in the art, including
antisense RNA, ribozymes and co-suppression. The use of antisense RNA to
suppress plant genes is described, for example, in U.S. Patent Nos.
4,801,540, 5,107,065 and 5,453,566. In such methods, an "antisense"
gene is constructed that encodes an RNA that is complementary to the
mRNA of a resident plant gene, such that expression of the antisense gene
inhibits the translation of the mRNA of the resident plant gene. Thus, the
activity of the resident gene is down-regulated. 

An additional method of down regulating gene activities involves
ribozymes, or catalytic hammerhead or hairpin RNA structures. The use of
ribozymes is described, for example, in U.S. Patent Nos. 4,987,071,
5,037,746, 5,116,742 and 5,354,855. These methods rely on the
expression of small catalytic "hammerhead" RNA molecules that are capable
of binding to and cleaving specific RNA sequences. Ribozymes designed to
specifically recognize a resident plant mRNA can be used to cleave the
mRNA and prevent its proper expression.

Down-regulation of gene activity more or less equivalent to that obtained
with ribozymes and antisense RNA can be achieved by adding additional
copies of the gene to be regulated. The process is referred to as
co-suppression and is described in, for example, U.S. Patent Nos. 5,034,323,
5,283,184 and 5,231,020.

Numerous plant genes may be targeted for down regulation. For
example, a gene may be down-regulated that encodes an enzyme that
catalyzes a reaction in a plant. Reduction of the enzyme activity may reduce
or eliminate products of the reaction which include any enzymatically
synthesized compound in the plant such as fatty acids, amino acids,
carbohydrates, nucleic acids and the like. Alternatively, the protein may be a
storage protein, such as zein, or a structural protein, the decreased
expression of which may lead to changes in seed amino acid composition or
plant morphological changes, respectively. The possibilities cited above are
provided only by way of example and do not represent the full range of
applications. 

(1.) Antisense RNA

Genes may be constructed, which when transcribed, produce
antisense RNA that is complementary to all or part(s) of a targeted
messenger RNA(s). The antisense RNA reduces production of the
polypeptide product of the messenger RNA. The polypeptide product may be
any protein encoded by the plant genome. The aforementioned genes will be
referred to as antisense genes. An antisense gene may thus be introduced
into a plant by transformation methods to produce a transgenic plant with
reduced expression of a selected protein of interest. For example, the
protein may be an enzyme that catalyzes a reaction in the plant. Reduction
of the enzyme activity may reduce or eliminate products of the reaction
which include any enzymatically synthesized compound in the plant such as
fatty acids, amino acids, carbohydrates, nucleic acids and the like.
Alternatively, the protein may be a storage protein, such as a zein, or a
structural protein, the decreased expression of which may lead to changes in
seed amino acid composition or plant morphological changes, respectively.
The possibilities cited above are provided only by way of example and do not 
represent the full range of applications. 
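
The complementarity between an antisense RNA and all or part of its target messenger RNA, as described above, amounts to a reverse complement. The following Python sketch is offered only as an illustration under that assumption; the function name `antisense_rna` and its `start`/`end` parameters are hypothetical, and the sketch does not address promoter choice, transcript stability or any other aspect of antisense gene construction.

```python
from typing import Optional

# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str, start: int = 0, end: Optional[int] = None) -> str:
    """Return the RNA complementary to mrna[start:end], written 5' to 3'."""
    target = mrna.upper()[start:end]
    if set(target) - set("ACGU"):
        raise ValueError("sequence must contain only A, C, G and U")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(RNA_COMPLEMENT)[::-1]


full_length = antisense_rna("AUGGCC")         # antisense to the whole toy mRNA
partial = antisense_rna("AUGGCC", start=3)    # antisense to bases 4 to 6 only
```

Passing `start` and `end` corresponds to an antisense RNA complementary to only part of the targeted messenger RNA.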

(2.) Ribozymes

Genes also may be constructed or isolated, which when transcribed,
produce RNA enzymes (ribozymes) which can act as endoribonucleases and
catalyze the cleavage of RNA molecules with selected sequences. The
cleavage of selected messenger RNAs can result in the reduced production of
their encoded polypeptide products. These genes may be used to prepare
transgenic plants which possess them. The transgenic plants may possess
reduced levels of polypeptides including, but not limited to, the polypeptides 
cited above. 

Ribozymes are catalytic RNA molecules, in some cases associated with
proteins, that cleave nucleic acids in a site-specific fashion. Ribozymes
have specific catalytic domains that possess endonuclease activity (Kim and
Cech, 1987; Gerlach et al., 1987; Forster and Symons, 1987). For example,
a large number of ribozymes accelerate phosphoester transfer reactions with
a high degree of specificity, often cleaving only one of several
phosphoesters in an oligonucleotide substrate (Cech et al., 1981; Michel and
Westhof, 1990; Reinhold-Hurek and Shub, 1992). This specificity has been
attributed to the requirement that the substrate bind via specific
base-pairing interactions to the internal guide sequence ("IGS") of the
ribozyme prior to chemical reaction.

Ribozyme catalysis has primarily been observed as part of sequence-
specific cleavage/ligation reactions involving nucleic acids (Joyce, 1989;
Cech et al., 1981). For example, U.S. Patent 5,354,855 reports that certain
ribozymes can act as endonucleases with a sequence specificity greater than 
that of known ribonucleases and approaching that of the DNA restriction 
enzymes. 

Several different ribozyme motifs have been described with RNA
cleavage activity (Symons, 1992). Examples include sequences from the
Group I self splicing introns including Tobacco Ringspot Virus (Prody et al.,
1986), Avocado Sunblotch Viroid (Palukaitis et al., 1979; Symons, 1981)
and Lucerne Transient Streak Virus (Forster and Symons, 1987). Sequences
from these and related viruses are referred to as hammerhead ribozymes
based on a predicted folded secondary structure.

Other suitable ribozymes include sequences from RNase P with RNA 
cleavage activity (Yuan et al., 1992; Yuan and Altman, 1994; U.S. Patents
5,168,053 and 5,624,824), hairpin ribozyme structures (Berzal-Herranz et
al., 1992; Chowrira et al., 1993) and Hepatitis Delta virus based ribozymes
(U.S. Patent 5,625,047). The general design and optimization of ribozyme
directed RNA cleavage activity has been discussed in detail (Haseloff and
Gerlach, 1988; Symons, 1992; Chowrira et al., 1994; Thompson et al.,
1995).

Another variable in ribozyme design is the selection of a cleavage
site on a given target RNA. Ribozymes are targeted to a given sequence by
virtue of annealing to a site by complementary base pair interactions. Two
stretches of homology are required for this targeting. These stretches of
homologous sequences flank the catalytic ribozyme structure defined above.
Each stretch of homologous sequence can vary in length from 7 to 15
nucleotides. The only requirement for defining the homologous sequences is
that, on the target RNA, they are separated by a specific sequence which is
the cleavage site. For a hammerhead ribozyme, the cleavage site is a
dinucleotide sequence on the target RNA: a uracil (U) followed by either an
adenine, cytosine or uracil (A, C or U) (Perriman et al., 1992; Thompson et
al., 1995). The frequency of this dinucleotide occurring in any given RNA is
statistically 3 out of 16. Therefore, for a given target messenger RNA of
1,000 bases, 187 dinucleotide cleavage sites are statistically possible.
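
The arithmetic behind these figures can be checked with a short sketch. Of the 16 possible dinucleotides, three (UA, UC and UU) match the stated rule, giving 3/16; multiplying by 1,000 bases gives 187.5, which the text reports as 187. The Python below assumes equal and independent base frequencies, which real messenger RNAs only approximate, and the function names `expected_sites` and `count_sites` are hypothetical.

```python
def expected_sites(length: int) -> float:
    """Expected number of U followed by A, C or U, at 3 matching dinucleotides of 16."""
    return length * 3 / 16


def count_sites(rna: str) -> int:
    """Count positions in an actual sequence where U is followed by A, C or U."""
    seq = rna.upper()
    return sum(
        1
        for i in range(len(seq) - 1)
        if seq[i] == "U" and seq[i + 1] in "ACU"
    )


estimate = expected_sites(1000)     # 187.5 under the equal-frequency assumption
observed = count_sites("GUCAUUG")   # UC at position 1 and UU at position 4
```

Counting sites in the actual target sequence, as `count_sites` does, would generally be preferred over the statistical estimate when a sequence is available.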

Designing and testing ribozymes for efficient cleavage of a target RNA
is a process well known to those skilled in the art. Examples of scientific
methods for designing and testing ribozymes are described by Chowrira et al.
(1994) and Lieber and Strauss (1995), each incorporated by reference. The
identification of operative and preferred sequences for use in down regulating
a given gene is simply a matter of preparing and testing a given sequence,
and is a routinely practiced "screening" method known to those of skill in the
art. 
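
As a first pass before any such preparing and testing, candidate sites on a target RNA can be enumerated from the targeting rule stated above: a U followed by A, C or U, flanked on each side by a stretch of 7 to 15 nucleotides. The Python sketch below follows that simplified layout only; the names `Candidate` and `candidate_sites` are hypothetical, the catalytic core is not modeled, and the output is a list of sequences to test rather than a prediction of cleavage efficiency.

```python
from typing import Iterator, NamedTuple

RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def revcomp(rna: str) -> str:
    """Reverse complement of an RNA string, read 5' to 3'."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


class Candidate(NamedTuple):
    position: int          # 0-based index of the U in the target
    site: str              # the dinucleotide (UA, UC or UU)
    upstream_target: str   # target stretch 5' of the site
    downstream_target: str # target stretch 3' of the site
    pairs_upstream: str    # ribozyme sequence complementary to upstream_target
    pairs_downstream: str  # ribozyme sequence complementary to downstream_target


def candidate_sites(target: str, arm: int = 7) -> Iterator[Candidate]:
    """Yield every site that has a full flanking stretch of `arm` bases on each side."""
    if not 7 <= arm <= 15:
        raise ValueError("arm length must be between 7 and 15 nucleotides")
    rna = target.upper()
    for i in range(arm, len(rna) - arm - 1):
        site = rna[i:i + 2]
        if site[0] == "U" and site[1] in "ACU":
            up = rna[i - arm:i]
            down = rna[i + 2:i + 2 + arm]
            yield Candidate(i, site, up, down, revcomp(up), revcomp(down))


# Hypothetical 16-base toy target with a single UC site in the middle.
hits = list(candidate_sites("GGGGGGGUCAAAAAAA", arm=7))
```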

(3.) Induction of gene silencing

It also is possible that genes may be introduced to produce transgenic
plants which have reduced expression of a native gene product by the
mechanism of co-suppression. It has been demonstrated in tobacco, tomato,
and petunia (Goring et al., 1991; Smith et al., 1990; Napoli et al., 1990; van
der Krol et al., 1990) that expression of the sense transcript of a native gene
will reduce or eliminate expression of the native gene in a manner similar to
that observed for antisense genes. The introduced gene may encode all or
part of the targeted native protein but its translation may not be required for
reduction of levels of that native protein.

(4.) Non-RNA-expressing sequences

DNA elements including those of transposable elements such as Ds,
Ac, or Mu, may be inserted into a gene to cause mutations. These DNA
elements may be inserted in order to inactivate (or activate) a gene and
thereby "tag" a particular trait. In this instance the transposable element
does not cause instability of the tagged mutation, because the utility of the
element does not depend on its ability to move in the genome. Once a
desired trait is tagged, the introduced DNA sequence may be used to clone
the corresponding gene, e.g., using the introduced DNA sequence as a PCR
primer together with PCR gene cloning techniques (Shapiro, 1983; Dellaporta
et al., 1988). Once identified, the entire gene(s) for the particular trait,
including control or regulatory regions where desired, may be isolated, cloned
and manipulated as desired. The utility of DNA elements introduced into an
organism for purposes of gene tagging is independent of the DNA sequence
and does not depend on any biological activity of the DNA sequence, i.e.,
transcription into RNA or translation into protein. The sole function of the
DNA element is to disrupt the DNA sequence of a gene.

It is contemplated that unexpressed DNA sequences, including
synthetic sequences, could be introduced into cells as proprietary "labels" of 
those cells and plants and seeds thereof. It would not be necessary for a 
label DNA element to disrupt the function of a gene endogenous to the host
organism, as the sole function of this DNA would be to identify the origin of
the organism. For example, one could introduce a unique DNA sequence into
a plant and this DNA element would identify all cells, plants, and progeny of
these cells as having arisen from that labeled source. It is proposed that
inclusion of label DNAs would enable one to distinguish proprietary
germplasm or germplasm derived from such, from unlabelled germplasm.
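
Where sequence data are available for a sample, the presence of such a label can in principle be checked by searching both strands for the label sequence. The following Python sketch illustrates the idea with exact string matching; the function name `carries_label` and both sequences are hypothetical, and in practice detection would more likely rely on PCR or hybridization than on a full sequence search.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def carries_label(genome: str, label: str) -> bool:
    """Return True if the label occurs on either strand of the genome sequence."""
    genome = genome.upper()
    label = label.upper()
    if not label:
        raise ValueError("label sequence must not be empty")
    # The label may have been inserted in either orientation.
    reverse_complement = label.translate(DNA_COMPLEMENT)[::-1]
    return label in genome or reverse_complement in genome


label = "GATTACAGG"                      # hypothetical proprietary label
labelled = "TTT" + "CCTGTAATC" + "AAA"   # carries the label on the opposite strand
unlabelled = "TTTTAAAA"

found = carries_label(labelled, label)
missing = carries_label(unlabelled, label)
```

A label would need to be long and unusual enough that it is unlikely to occur by chance in unlabelled germplasm.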

Another possible element which may be introduced is a matrix
attachment region element (MAR), such as the chicken lysozyme A element
(Stief, 1989), which can be positioned around an expressible gene of interest
to effect an increase in overall expression of the gene and diminish position
dependent effects upon incorporation into the plant genome (Stief et al.,
1989; Phi-Van et al., 1990). Sequences such as MARs can be included on
the artificial chromosome to enhance gene expression.

3. Transgenic models for evaluation of genes and discovery of 
new traits 

Of significant interest is the use of plants and plant cells containing
artificial chromosomes for the evaluation of new genetic combinations and
discovery of new traits. Artificial chromosomes, by virtue of the fact that
they can contain significant amounts of DNA, can also encode
numerous genes and accordingly a multiplicity of traits. It is contemplated
here that artificial chromosomes, when formed from one plant species, can
be evaluated in a second plant species. The resultant phenotypic changes
observed, for example, can indicate the nature of the genes contained within
the DNA of the artificial chromosome, and hence permit the
identification of new genetic activities. Artificial chromosomes containing
euchromatic DNA or partially containing euchromatic DNA can serve as a
valuable source of new traits when transferred to an alien plant cell
environment. For example, it is contemplated that artificial chromosomes
derived from dicot plant species can be introduced into monocot plant
species by transferring a dicot artificial chromosome that contains a
region of euchromatic DNA containing expressed genes.

The artificial chromosomes can be generated or manipulated in such a 
fashion that a large region of naturally occurring plant DNA becomes 
incorporated into the artificial chromosome. This allows the artificial
chromosome to contain new genetic activities and hence carry new traits.
For example, an artificial chromosome can be introduced into a wild relative
of a crop plant under conditions whereby a portion of the DNA present in the
chromosomes of the wild relative is transferred to the artificial chromosome.
After isolation of the artificial chromosome, this naturally occurring region of
DNA from the wild relative, now located on the artificial chromosome, can be
introduced into the domesticated crop species and the genes encoded within 
the transferred DNA expressed and evaluated for utility. New traits and gene 
systems can be discovered in this fashion. 

Artificial chromosomes modified to recombine with plant DNA offer
many advantages for the discovery and evaluation of traits in different plant
species. When the artificial chromosome containing DNA from one plant
species is introduced into a new plant species, new traits and genes can be
introduced. This use of an artificial chromosome makes it possible to
overcome the sexual barrier that prevents transfer of genes from one plant
species to another species. Using artificial chromosomes in this fashion
allows for many potentially valuable traits to be identified including traits that
are typically found in wild species. Other valuable applications for artificial
chromosomes include the ability to transfer large regions of DNA from one
plant species to another, DNA encoding potentially valuable traits such as
altered oil, carbohydrate or protein composition, multiple genes encoding
enzymes capable of producing valuable plant secondary metabolites, genetic
systems encoding valuable agronomic traits such as disease and insect
resistance, genes encoding functions that allow association with soil
bacteria such as growth promoting bacteria or nitrogen fixing bacteria, or
genes encoding traits that confer freezing, drought or other stress tolerances.
In this fashion, artificial chromosomes can be used to discover regions of 
plant DNA that encode valuable traits. 

The artificial chromosome can also be designed to allow the transfer
and subsequent incorporation of these valuable traits now located on the
artificial chromosome into the natural chromosomes of a plant species. In
this fashion the artificial chromosomes can be used to transfer large regions
of DNA encoding traits normally found in one plant species into another plant
species. In this fashion, it is possible to derive a plant cell that no longer
needs to carry an artificial chromosome to possess the new trait. Thus the
artificial chromosome would serve as the transfer mechanism to permit the
formation of plants with a greater degree of genetic diversity.

An artificial chromosome can be designed in a variety of ways to
accomplish the afore-mentioned purposes. An artificial chromosome can be
modified to contain sequences that promote homologous recombination
within plant cells, or be modified to contain a genetic system that functions
as a site-specific recombination system. For example, the DNA sequence of
Arabidopsis is now known. To construct an artificial chromosome capable of
recombining with a specific region of Arabidopsis DNA, a sequence of
Arabidopsis DNA, normally located near a chromosomal location encoding
genes of potential interest, can be introduced into an artificial chromosome by
methods provided herein. It may be desirable to include a second region of
DNA within the artificial chromosome that provides a second flanking
sequence to the region encoding genes of potential interest, to promote a
double recombination event which would ensure transfer of the entire
chromosomal region encoding genes of potential interest to the artificial
chromosome. The modified artificial chromosome, containing the DNA
sequences capable of homologous recombination, can then be
introduced into Arabidopsis cells and the homologous recombination event
selected.
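
The effect of a double recombination event between two flanking sequences can be pictured with a toy model in which chromosomes are plain strings. The Python sketch below is an assumption-laden illustration only: the function name `double_crossover` and all sequences are hypothetical, the exchange is modeled as a clean reciprocal swap of whatever lies between the two flanks, and nothing about recombination frequency or selection is represented.

```python
from typing import Tuple


def double_crossover(plant_chr: str, artificial_chr: str,
                     left: str, right: str) -> Tuple[str, str]:
    """Swap the segments lying between the left and right flanks of two molecules."""

    def split(seq: str) -> Tuple[str, str, str]:
        i = seq.find(left)
        if i == -1:
            raise ValueError("left flanking sequence not found")
        inner_start = i + len(left)
        j = seq.find(right, inner_start)
        if j == -1:
            raise ValueError("right flanking sequence not found")
        # head ends with the left flank; tail begins with the right flank
        return seq[:inner_start], seq[inner_start:j], seq[j:]

    p_head, p_mid, p_tail = split(plant_chr)
    a_head, a_mid, a_tail = split(artificial_chr)
    return p_head + a_mid + p_tail, a_head + p_mid + a_tail


LEFT, RIGHT = "ACGTACGT", "TTGGCCAA"          # hypothetical flanking sequences
plant = "GG" + LEFT + "CAGCAGCAG" + RIGHT + "CC"   # region of interest: CAGCAGCAG
artificial = "AA" + LEFT + "G" + RIGHT + "TT"      # placeholder between the flanks

new_plant, new_artificial = double_crossover(plant, artificial, LEFT, RIGHT)
```

After the modeled event the artificial chromosome string carries the whole region that lay between the two flanks on the plant chromosome, which is the outcome the second flanking sequence is intended to promote.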

It is convenient to include a marker gene to allow for the selection of a
homologous recombination event. The marker gene is preferably inactive
unless activated by an appropriate homologous recombination event. For
example, US 5,272,071 describes a method where an inactive plant gene is
activated by a recombination event such that desired homologous
recombination events can be easily scored. Similarly, US 5,501,967
describes a method for the selection of homologous recombination events by
activation of a silent selection gene first introduced into the plant DNA, the
gene being activated by an appropriate homologous recombination event.
Both of these methods can be applied to enable a selective process to be
included to select for recombination between an artificial chromosome and
a plant chromosome. Once the homologous recombination event is
detected and selected, the artificial chromosome is isolated and introduced
into a recipient cell, for example, tobacco, corn, wheat or rice, and the
expression of the newly introduced DNA sequences is evaluated. Selection of
recombinant events can take place in cell culture, or following seed formation
and screening of seedling plants or seed itself.

Phenotypic changes in the recipient plant cells containing the artificial
chromosome, or in regenerated plants containing the artificial chromosome,
allow for the evaluation of the nature of the traits encoded by the genes of
interest, for example, Arabidopsis DNA, under conditions naturally found in
plant cells, including the naturally occurring arrangement of DNA sequences 
responsible for the developmental control of the traits in the normal 
chromosomal environment. 

Traits such as durable fungal or bacterial disease resistance, new oil and
carbohydrate compositions, valuable secondary metabolites such as
phytosterols, flavonoids, efficient nitrogen fixation or mineral utilization,
resistance to extremes of drought, heat or cold are all found within different
populations of plant species and are often governed by multiple genes. The use
of single gene transformation technologies does not permit the evaluation of the
multiplicity of genes controlling many valuable traits. Thus, incorporation of
these genes into artificial chromosomes allows the rapid evaluation of the utility 
of these genetic combinations in heterologous plant species. 

The large scale order and structure of the artificial chromosome provides
a number of unique advantages in screening for new utilities or new phenotypes
within heterologous plant species. The size of new DNA that can be carried by
an artificial chromosome can be millions of base pairs of DNA, representing
potentially numerous genes that may have different or new utility in a
heterologous plant cell. Because the artificial chromosome is a "natural"
environment for gene expression, the problems of variable gene expression and
silencing seen for genes transferred by random insertion into a genome should
not be observed. Similarly, there is no need to engineer the genes for
expression, and the genes inserted would not need to be recombinant genes.
Thus, transferred genes are fully expected to be expressed in the typical
temporal and spatial fashion as observed in the species from where the genes
were initially isolated.

A valuable feature for these utilities is the ability to isolate the artificial
chromosomes and to further isolate, manipulate and introduce into other cells
artificial chromosomes carrying unique genetic compositions.

Thus, artificial chromosomes and homologous recombination
in plant cells can be used to isolate and identify many valuable crop traits. In
addition to the use of artificial chromosomes for the isolation and testing of
large regions of naturally occurring DNA, methods for the use of artificial
chromosomes and cloned DNA are also contemplated. Similar to that described
above, artificial chromosomes can be used to carry large regions of cloned DNA,
including that derived from other plant species.

The ability to incorporate DNA elements into artificial chromosomes as
they are being formed allows for the development of artificial chromosomes
specifically engineered as a platform for testing of new genetic combinations,
or "genomic" discoveries for model species such as Arabidopsis. Specific
"recombinase" systems can be used in plant cells to excise or re-arrange genes;
these same systems can be used to derive new gene combinations contained
on an artificial chromosome. In this regard, it is contemplated that the use of
site specific recombination sequences can have considerable utility in
developing artificial chromosomes containing DNA sequences recognized by
recombinase enzymes and capable of accepting DNA sequences containing
same. The use of site-specific recombination as a means to target an
introduced DNA to a specific locus has been demonstrated in the art and such
methods can be employed. The recombinase systems can also be used to
transfer the cloned DNA regions contained within the artificial chromosome to
the naturally occurring plant chromosomes.

Many site specific recombinases have been described in the literature
(Kilby et al., Trends in Genetics, 9(12):413-418, 1993). Among these are:
an activity identified as R encoded by the pSR1 plasmid of Zygosaccharomyces
rouxii, FLP encoded by the 2 µm circular plasmid from Saccharomyces
cerevisiae and Cre-lox from the phage P1.

The integration function of site specific recombinases is contemplated as
a means to assist in the derivation of genetic combinations on artificial
chromosomes. To accomplish this, it is contemplated that a first step
of introducing site-specific recombinase sites into the genome of a plant cell in
an essentially random manner is conducted, such that the plant cell has one or
more site-specific recombinase recognition sequences on one or more of the
plant chromosomes. An artificial chromosome is then introduced into the plant
cell, the artificial chromosome engineered to contain a recombinase recognition
site capable of being recognized by a site specific recombinase. Optionally a
gene encoding a recombinase enzyme is also included, preferably under the
control of an inducible promoter. Expression of the site specific recombinase
enzyme in the plant cell, either by induction of an inducible recombinase gene
or by transient expression of a recombinase sequence, causes a site-specific
recombination event to take place, leading to the insertion of a region of the
plant chromosomal DNA containing the recombinase recognition site into the
recombinase recognition site of the artificial chromosome, forming an artificial
chromosome containing plant chromosomal DNA. The artificial chromosome
can be isolated and introduced into a heterologous host, preferably a plant host,
and expression of the newly introduced plant chromosomal DNA can be
monitored and evaluated for desirable phenotypic changes. Accordingly,
carrying out this recombination with a population of plant cells wherein the
chromosomally located recombinase recognition site is randomly scattered
throughout the chromosomes of the plant can lead to the formation of a
population of artificial chromosomes, each with a different region of plant
chromosomal DNA, each representing a new genetic combination.

This particular method involves the precise site-specific insertion of
chromosomal DNA into the artificial chromosome. This precision has been
demonstrated in the art. For example, Fukushige and Sauer (Proc. Natl. Acad.
Sci. USA, 89:7905-7909, 1992) demonstrated that the Cre-lox homologous
recombination system could be successfully employed to introduce DNA into a
predefined locus in a chromosome of mammalian cells. In this demonstration
a promoter-less antibiotic resistance gene modified to include a lox sequence at
the 5' end of the coding region was introduced into CHO cells. Cells were
re-transformed by electroporation with a plasmid that contained a promoter with
a lox sequence and a transiently expressed Cre recombinase gene. Under the
conditions employed, the expression of the Cre enzyme catalyzed the
homologous recombination between the lox site in the chromosomally located
promoter-less antibiotic resistance gene and the lox site in the introduced
promoter sequence, leading to the formation of a functional antibiotic resistance
gene. The authors demonstrated efficient and correct targeting of the
introduced sequence: 54 of 56 lines analyzed corresponded to the predicted
single copy insertion of the DNA due to Cre catalyzed site specific homologous
recombination between the lox sequences.

The use of the same Cre-lox system has been demonstrated in plants
(Dale and Ow, Gene 91:79-85, 1995) to specifically excise, delete or insert
DNA. The precise event is controlled by the orientation of lox DNA sequences:
in cis the lox sequences direct the Cre recombinase to either delete (lox
sequences in direct orientation) or invert (lox sequences in inverted orientation)
DNA flanked by the sequences, while in trans the lox sequences can direct a
homologous recombination event resulting in the insertion of a recombinant
DNA. Accordingly a lox sequence may be first added to a genome of a plant
species capable of being transformed and regenerated to a whole plant to serve
as a recombinase target DNA sequence for recombination with an artificial
chromosome. The lox sequence may be optimally modified to further contain
a selectable marker which is inactive but can be activated by insertion of the lox
recombinase recognition sequence into the artificial chromosome.
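
The orientation rules described above can be illustrated with a toy model. The following Python sketch is illustrative only: the representation of DNA as a list of segment labels, the "lox>" and "lox<" orientation markers, and the function names are assumptions made for this sketch and are not part of the methods described herein.

```python
LOX_SITES = ("lox>", "lox<")  # hypothetical markers: a lox site and its orientation


def _lox_positions(dna):
    """Return the indices of all lox markers in a list of segment labels."""
    return [i for i, seg in enumerate(dna) if seg in LOX_SITES]


def recombine_cis(dna):
    """Model Cre acting on two lox sites located on the same molecule."""
    sites = _lox_positions(dna)
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    a, b = sites
    if dna[a] == dna[b]:
        # Direct orientation: the flanked DNA is deleted, one lox site remains.
        return dna[:a + 1] + dna[b + 1:]
    # Inverted orientation: the flanked DNA is inverted in place
    # (only segment order is modeled here, not strand sense).
    return dna[:a + 1] + dna[a + 1:b][::-1] + dna[b:]


def recombine_trans(target, payload):
    """Model insertion of lox-bearing DNA into a target carrying one lox site."""
    sites = _lox_positions(target)
    if len(sites) != 1:
        raise ValueError("expected exactly one lox site in the target")
    a = sites[0]
    lox = target[a]
    # The inserted DNA ends up flanked by two lox sites in direct orientation.
    return target[:a] + [lox] + list(payload) + [lox] + target[a + 1:]


if __name__ == "__main__":
    print(recombine_cis(["A", "lox>", "B", "C", "lox>", "D"]))   # deletion
    print(recombine_cis(["A", "lox>", "B", "C", "lox<", "D"]))   # inversion
    print(recombine_trans(["P", "lox>", "Q"], ["marker"]))       # insertion
```

In this simplified picture, an inserted DNA is itself flanked by directly oriented lox sites and so could in principle be excised again, which is one reason transient or inducible recombinase expression is attractive.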

A promoterless marker gene or selectable marker gene linked to the
recombinase recognition sequence, which is first inserted into the chromosomes
of a plant cell, can be used to engineer a platform chromosome. A promoter is
linked to a recombinase recognition site, in an orientation that allows the
promoter to control the expression of the marker or selectable marker gene
upon recombination within the artificial chromosome. Upon a site-specific
recombination event between a recombinase recognition site in a plant
chromosome and the recombinase recognition site within the introduced
artificial chromosome, a cell is derived with a recombined artificial chromosome,
the artificial chromosome containing an active marker or selectable marker
activity that permits the identification and/or selection of the cell.

The artificial chromosomes can be transferred to other plant species and
the functionality of the new combinations tested. The ability to conduct such
an inter-chromosomal transfer of sequences has been demonstrated in the art.
For example, the use of the Cre-lox recombinase system to cause a
chromosome recombination event between two chromatids of different
chromosomes has been shown.

Any number of recombination systems may be employed (see, U.S.
provisional application Serial No. filed the same day herewith under attorney
docket no. 24601-P420). Such systems include, but are not limited to,
bacterially derived systems such as the Int/att system of phage lambda and the
Gin/gix system.

More than one recombination system may be employed, including, for
example, one recombinase system for the introduction of DNA into an artificial
chromosome, and a second recombinase system for the subsequent transfer of
the newly introduced DNA contained within an artificial chromosome into the
naturally occurring chromosome of a second plant species. The choice of the
specific recombination system used will be dependent on the nature of the
modification contemplated.

By having the ability to isolate an artificial chromosome, and in particular
artificial chromosomes containing plant chromosomal DNA introduced via
site-specific recombination, and re-introduce the chromosome into other cells,
particularly plant cells, these new combinations can be evaluated in different
crop species without the need to first isolate and modify the genes, or carry out
multiple transformations or gene transfers to achieve the same isolation and
testing of combinations of the genes in plants. The use of a site
specific recombinase and artificial chromosomes also allows the convenient
recovery of the plant chromosomal region into other recombinant DNA vectors
and systems for manipulation and study.

The artificial chromosomes can be engineered as platforms to accept
large regions of cloned DNA, such as that contained in Bacterial Artificial
Chromosomes (BACs) or Yeast Artificial Chromosomes (YACs). It is further
contemplated that, as a result of the typical structure of amplification-based
artificial chromosomes, such as, for example, SATACs (or ACes), containing
tandemly repeated DNA blocks, more than one cloned DNA sequence can be
introduced by recombination processes. In particular, recombination within a
predefined region of the tandemly repeated DNA within the artificial
chromosome provides a mechanism to "stack" numerous regions of cloned
DNA, including large regions of DNA contained within BAC or YAC clones.
Thus, multiple combinations of genes can be introduced onto artificial
chromosomes and these combinations tested for functionality. In particular, it
is contemplated that multiple YACs or BACs can be stacked onto an artificial
chromosome, the BACs or YACs containing multiple genes of complex
pathways or multiple genetic pathways. The BACs or YACs are typically
selected based on genetic information available within the public domain, for
example from the Arabidopsis Information Management System
(http://aims.cps.msu.edu/aims/index.html) or the information related to the plant
DNA sequences available from the Institute for Genomic Research
(http://www.tigr.org) and other sites known to those skilled in the art.
Alternatively, clones can be chosen at random and evaluated for functionality.
It is contemplated that combinations providing a desired phenotype can be
identified by isolation of the artificial chromosome containing the combination
and analyzing the nature of the inserted cloned DNA.

In another embodiment of the methods provided herein for discovering
genes associated with plant traits, the artificial chromosome used to transfer
plant DNA to a host cell for evaluation therein will contain large regions of plant
DNA, in particular plant euchromatin, as a result of the process by which the
artificial chromosome is produced. In particular, the artificial chromosome may
be an amplification-based artificial chromosome, including, but not limited to:
(1) a minichromosome arising from breakage of a dicentric chromosome, (2) an
artificial chromosome containing one or more regions of repeating nucleic acid
units wherein the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid, (3) an artificial chromosome
containing one or more regions of repeating nucleic acid units wherein the
repeat region(s) is made up predominantly of euchromatic DNA or contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
90% euchromatic DNA, (4) an artificial chromosome containing one or more
regions of repeating nucleic acid units wherein the artificial chromosome is
made up of substantially equivalent amounts of heterochromatin and
euchromatin, (5) an artificial chromosome containing one or more regions
of repeating nucleic acid units having common nucleic acid sequences that
represent euchromatic and heterochromatic nucleic acid and (6) a sausage-like
structure that contains a portion or all of a euchromatin-containing arm of a
plant chromosome.

In these methods for discovering genes associated with plant traits,
because the artificial chromosome used to transfer plant DNA to a host cell for
evaluation therein is generated to already contain large amounts of plant DNA,
in particular plant euchromatin, there is no need to introduce plant euchromatin
into the artificial chromosomes by homologous or site-specific recombination.

4. Use of artificial chromosomes for preparation and screening of 
libraries 

Since large fragments of DNA can be incorporated into artificial
chromosomes (ACs), they are well-suited for use as cloning vehicles that can
accommodate entire genomes in the preparation of genomic DNA libraries,
which then can be readily screened for functionality as described above or for
specific gene sequences for further modification and study. For example, it is
possible to use artificial chromosomes to prepare artificial chromosome libraries
containing plant genomic DNA that are useful in the identification and isolation
of functional DNA components such as genes, centromeric DNA and telomeric
DNA from a variety of different species of plants.

The following examples are included for illustrative purposes only and are 
not intended to limit the scope of the invention. 

Example 1 

Generation of Arabidopsis protoplasts

Plant protoplasts are typically generated from plant cells following
standard techniques (for example, Maheshwari et al., Crit. Rev. Plant Sci.
14:149-178, 1995; Ramulu et al., Methods in Molecular Biology 111:227-242,
1999). Typically plant protoplasts are prepared from fresh plant tissue, e.g.,
leaf, or can be prepared by converting cell suspension cultures to protoplasts
by removal of the cell walls enzymatically. For production of Arabidopsis
protoplasts, the methods of Karesh et al. (Plant Cell Reports 9:575-578, 1991)
and Mathur et al. (Plant Cell Reports 14:221-226, 1995) were used to generate
Arabidopsis suspension cultures by modifications thereof as described below.
These cells were maintained in liquid culture and subcultured as required,
usually between 7 and 10 days in culture.

Establishment of suspension cultures 

Cell suspension cultures derived from root callus of Arabidopsis thaliana
cv. Columbia, RLD and Landsberg erecta were used. Calli were induced from
roots of 3 week-old seedlings on callus induction medium containing MS basic
media (Murashige and Skoog (1962) Physiol. Plant 15:473-497) with 3%
sucrose, 0.5 mg/l naphthalene acetic acid (NAA) and 0.05 mg/l kinetin (Sigma
Aldrich Canada). The cell suspension cultures were grown from the calli in
liquid callus induction medium at 22°C with shaking at 120 rpm. They were
subcultured every 7 days.

Generation of protoplasts 

One gram of 4-5 day-old suspension culture was incubated in 6 ml
enzyme solution containing 1% Cellulase 'Onozuka' R-10 and 0.25%
Macerozyme R-10 in 35 g/l CaCl2·2H2O (Hartmann et al. (1998) Plant Mol. Biol.
36:741-754) at 22°C in the dark with shaking at 70 rpm for 15 h.
The protoplast mixture was poured through a 100 µm nylon mesh sieve and
centrifuged at 250xg for 5 min. The protoplasts were washed with 35 g/l
CaCl2·2H2O and resuspended in 10 ml floating medium containing B5 medium
(Gamborg et al. (1968) Exp. Cell Res. 50:151-158) with 144 g/l sucrose and 1
mg/l 2,4-dichlorophenoxyacetic acid (2,4-D). The protoplasts were centrifuged
at 80xg for 10 min, collected at the interface and used immediately for
transfection.
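
As a worked illustration of the quantities in the enzyme solution above, the following Python sketch converts the stated concentrations into amounts for the 6 ml volume. It assumes the percentages are weight/volume (g per 100 ml), which the text does not state explicitly; the function names are hypothetical.

```python
CACL2_2H2O_G_PER_MOL = 147.01  # molar mass of calcium chloride dihydrate


def grams_for_percent_wv(percent, volume_ml):
    """Mass (g) of solute for a % (w/v) solution, i.e. g per 100 ml."""
    return percent / 100.0 * volume_ml


def grams_for_g_per_l(g_per_l, volume_ml):
    """Mass (g) of solute for a concentration given in g/l."""
    return g_per_l * volume_ml / 1000.0


def molarity(g_per_l, g_per_mol):
    """Molar concentration (mol/l) from a mass concentration."""
    return g_per_l / g_per_mol


if __name__ == "__main__":
    volume_ml = 6.0
    print("Cellulase (1% w/v):      %.3f g" % grams_for_percent_wv(1.0, volume_ml))
    print("Macerozyme (0.25% w/v):  %.3f g" % grams_for_percent_wv(0.25, volume_ml))
    print("CaCl2.2H2O (35 g/l):     %.3f g" % grams_for_g_per_l(35.0, volume_ml))
    print("CaCl2.2H2O molarity:     %.3f M" % molarity(35.0, CACL2_2H2O_G_PER_MOL))
```

Under the weight/volume assumption, 6 ml of solution calls for 60 mg cellulase, 15 mg Macerozyme and 210 mg CaCl2·2H2O, the latter corresponding to roughly 0.24 M.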

Example 2 

Generation of Tobacco Mesophyll Protoplasts 

Mesophyll protoplasts were generated from leaves of sterile plantlets of N.
tabacum cv. Xanthi. The plantlets were grown aseptically on MSO medium (MS
basal media, 3% sucrose, 0.05% morpholinoethanesulfonic acid (MES), 1.0
mg/l benzyl adenine (BA), 0.1 mg/l NAA and 0.8% agar, pH 5.8) at 22°C under
a 16/8 h photoperiod (see also Bilang et al. (1994) Plant Molecular Biology
Manual A1:1-6). Fully expanded leaves (2x4 cm) were cut in half, the main
vein removed and the upper epidermis scored with parallel cuts. Leaf pieces
were immersed in 6 ml enzyme solution containing 1.2% Cellulase 'Onozuka'
R-10 and 0.4% Macerozyme R-10 in K4 medium (Nagy and Maliga (1976) Z.
Pflanzenphysiol. 78:453-455) and incubated at 22°C for 15 h without shaking.
The protoplasts were purified by pouring through a 100 µm nylon mesh sieve.
The suspension of protoplasts was carefully overlaid with 1 ml W5 solution (Bilang
et al. (1994) Plant Molecular Biology Manual A1:1-6) and centrifuged at 80xg
for 10 min. Protoplasts were then resuspended in W5 solution at a density of
1 x 10^6 protoplasts/ml and stored at 4°C for 1 to 2 hours prior to treatment, for
example, DNA uptake or chromosome transfer.

Example 3 

Production of Tobacco Protoplasts from Suspension Cultures 
Tobacco BY-2 protoplasts are prepared from suspension cultures according
to the method of Nagata et al. [(1981) Molecular and General Genetics,
184:161-165].

Example 4 

Generation of Brassica Hypocotyl Protoplasts 

Genotypes of Brassica napus, B. oleracea, B. juncea and B. carinata may
be used to generate protoplasts. Seeds of Brassica napus were
surface-sterilized (for 2 min with 70% ethanol, then for 20 min with 2.4%
sodium hypochlorite containing one drop of Tween 20 per 100 ml). Seeds were
rinsed thoroughly with sterile distilled water and grown aseptically on
autoclaved germination medium (half-strength basal Murashige and Skoog's
medium (MS), 1% sucrose, 0.8% agar, pH 5.8). Unless otherwise indicated,
the protoplast generation procedures were performed aseptically and solutions
and media were filter-sterilized. Alternatively, protoplasts can be generated and
cultured successfully from different explants using various protocol
modifications (for example, Kao et al. (1991) Plant Science 75:63-72; Kao et
al. (1990) Plant Cell Rep. 9:311-315; Kao and Seguin-Swartz (1987) Plant Cell
Tiss. Org. Cult. 10:79-90; Kao (1977) Mol. Gen. Genet. 150:225-230).

Generation of Hypocotyl Protoplasts

Hypocotyls were excised from 4 or 5 day-old seedlings grown aseptically
in the dark, with or without light exposure for a few hours prior to use. The
explants were cut transversely into 2-5 mm pieces and incubated in enzyme
solution (salts, vitamins and organic acids of Kao's medium (Kao (1977) Mol.
Gen. Genet. 150:225-230), 0.4 g/l CaCl2·2H2O, 13% sucrose, 1%
Cellulase 'Onozuka' R-10, 0.1% Pectolyase Y23, pH 5.6) in petri dishes, in
darkness, without agitation for 14-18 hours, then with agitation on a rotary
shaker (ca. 50 rpm) for 15-30 min.

The mixture was filtered through a 63 µm nylon screen into centrifuge
tubes, and an equal volume of 17.5% sucrose was added to each tube.
Following centrifugation (ca. 100xg, 8 min), the protoplast band that formed at
the top of each tube was collected. Protoplasts were washed 3 times by
resuspension in wash solution [solution W5 of Menczel and Wolfe (1984, Plant
Cell Rep 3:196-198) at a reduced strength (0.8X)] followed by centrifugation
at 100xg for 3-5 min and discarding the supernatant.

Protoplasts were cultured in Kao's medium containing the salts, vitamins
and organic acids with 30 g/l sucrose, 68.4 g/l glucose, 0.5 mg/l NAA, 0.5 mg/l
BA, 0.5 mg/l 2,4-D, pH 5.7, at a density of 1 x 10^5 per ml and incubated at
25°C, 16 h photoperiod, in dim fluorescent light (25 µE m^-2 s^-1).
After 5-8 days in culture, 1-1.5 ml of feeder medium (the above
medium except with 55.8 g/l glucose instead of 68.4 g/l) was added to each
dish, and the dishes were placed under brighter fluorescent light (50 µE m^-2 s^-1).
At about 14 days, 1-2 ml of medium were removed from each dish, and 2-3 ml
of feeder medium containing basal B5 medium (Gamborg et al. (1968) Exp. Cell
Res. 50:151-158), 3% sucrose, 3.8% glucose, 0.5 mg/l BA, 0.5 mg/l NAA, and
0.5 mg/l 2,4-D, pH 5.7, were added. At about 21 days, if microcolonies have
not yet formed, the cultures can be fed with the last feeder medium except with
2.2% glucose instead of 3.8%. Protoplast cultures can be washed when
necessary by adding new feeder medium, gently swirling petri dishes, allowing
cells to settle, removing most of the supernatant and adding fresh medium to
the dishes.

At 3-5 weeks, microcolonies were embedded with medium containing a 1:1
mixture of the last feeder medium and proliferation medium, which contains the
components of the feeder medium with 0.9% glucose and 1.6% agarose to
make a concentration of 0.8% agarose in the final mixture. Cultures were incubated
as described above in bright fluorescent light (80-100 µE m^-2 s^-1). After 10 days
to 2 weeks, green colonies were plated onto the regeneration medium.
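
The stated final agarose concentration follows from simple dilution arithmetic. The short Python sketch below checks it; the function name is hypothetical, and the calculation assumes the feeder medium itself contains no agarose, as its listed composition suggests.

```python
def mix_concentration(conc_a, vol_a, conc_b, vol_b):
    """Concentration after combining two solutions (same units for both)."""
    total_volume = vol_a + vol_b
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return (conc_a * vol_a + conc_b * vol_b) / total_volume


if __name__ == "__main__":
    # 1:1 mixture of feeder medium (assumed 0% agarose) and
    # proliferation medium (1.6% agarose).
    final_agarose = mix_concentration(0.0, 1.0, 1.6, 1.0)
    print("Final agarose: %.1f%%" % final_agarose)
```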

Example 5 

Preparation of a Transformation Vector Useful for the Induction of
Plant Artificial Chromosome Formation

Plant artificial chromosomes (PACs) can be generated by introducing
nucleic acid, such as DNA, which can include an amplification-inducing DNA
and/or a targeting DNA, for example rDNA or lambda DNA, into a plant cell,
allowing the cell to grow, and then identifying from among the resulting cells
those that include a chromosome with a structure that is distinct from that of
any chromosome that existed in the cell prior to introduction of the nucleic acid.
The structure of a PAC reflects amplification of chromosomal DNA, for example,
segmented, repeat region-containing and heterochromatic structures. It is also
possible to select cells that contain structures that are precursors to PACs, for
example, chromosomes containing more than one centromere and/or fragments
thereof, and culture and/or manipulate them to ultimately generate a PAC within
the cell.

In the method of generating PACs, the nucleic acid can be introduced
into a variety of plant cells. The nucleic acid can include targeting DNA and/or
a plant expressible DNA encoding one or multiple selectable markers (e.g., DNA
encoding bialaphos (bar) resistance) or scorable markers (e.g., DNA encoding
GFP). Examples of targeting DNA include, but are not limited to, N. tabacum
rDNA intergenic spacer sequence (IGS) and Arabidopsis rDNA such as the 18S,
5.8S, 26S rDNA and/or the intergenic spacer sequence. The DNA can be
introduced using a variety of methods, including, but not limited to,
Agrobacterium-mediated methods, PEG-mediated DNA uptake and
electroporation using, for example, standard procedures according to Hartmann
et al. [(1998) Plant Molecular Biology 36:741]. The cell into which such DNA
is introduced can be grown under selective conditions, or can initially be grown
under non-selective conditions and then transferred to selective media. The
cells or protoplasts can be placed on plates containing a selection agent to
grow, for example, individual calli. Resistant calli can be scored for scorable
marker expression. Metaphase spreads of resistant cultures can be prepared,
and the metaphase chromosomes examined by FISH analysis using specific
probes in order to detect amplification of regions of the chromosomes. Cells
that have artificial chromosomes with functioning centromeres or artificial
chromosomal intermediate structures, including, but not limited to, dicentric
chromosomes, formerly dicentric chromosomes, minichromosomes,
heterochromatin structures (e.g., sausage chromosomes), and stable
self-replicating artificial chromosomal intermediates as described herein, are
identified and cultured. In particular, the cells containing self-replicating artificial
chromosomes are identified.

The DNA introduced into a plant cell for the generation of PACs can be
in any form, including in the form of a vector. An exemplary vector for use in
methods of generating PACs can be prepared as follows.

For the production of artificial chromosomes, plant transformation
vectors, as exemplified by pAgIIa and pAgIIb, containing a selectable marker,
a targeting sequence, and a scorable marker were constructed using procedures
well known in the art to combine the various fragments. The vectors can be
prepared using vector pAg1 as a base vector and inserting the following DNA
fragments into pAg1: DNA encoding β-glucuronidase under the control of the
nopaline synthase (NOS) promoter fragment and flanked at the 3' end by the
NOS terminator fragment, a fragment of mouse satellite DNA and an N.
tabacum rDNA intergenic spacer sequence (IGS). In constructing plant
transformation vectors, vector pAg2 can also be used as the base vector.

1. Construction of pAg1

Vector pAg1 (SEQ. ID. NO: 1; see Figure 1) is a derivative of the
CAMBIA vector named pCambia 3300 (Center for the Application of Molecular
Biology to International Agriculture, i.e., CAMBIA, Canberra, Australia;
www.cambia.org), which is a modified version of vector pCambia 1300 to
which has been added DNA from the bar gene conferring resistance to
phosphinothricin. The nucleotide sequence of pCambia 3300 is provided in
SEQ. ID. NO: 2. pCambia 3300 also contains a lacZ alpha sequence containing
a polylinker region.

pAg1 was constructed by inserting two new functional DNA fragments
into the polylinker of pCambia 3300: one sequence containing an attB site and
a promoterless zeomycin resistance-encoding DNA flanked at the 3' end by a
SV40 polyA signal sequence, and a second sequence containing DNA from the
hygromycin resistance gene (hygromycin phosphotransferase) conferring
resistance to hygromycin for selection in plants. Although the zeomycin-SV40
polyA signal fusion is not expected to provide the basis for zeomycin selection
in plant cells, it can be activated in mammalian cells by insertion of a functional
promoter element into the attB site by site-specific recombination catalyzed by
the Lambda att integrase. Thus, the inclusion of the attB-zeomycin sequences
allows for evaluation of functionality of plant artificial chromosomes in
mammalian cells by activation of the zeomycin resistance-encoding DNA, and
provides an att site for further insertion of new DNA sequences into plant
artificial chromosomes formed as a result of using pAg1 for plant
transformation. The second functional DNA fragment allows for selection of
plant cells with hygromycin. Thus, pAg1 contains DNA from the bar gene
conferring resistance to phosphinothricin, DNA from the hygromycin resistance
gene, both resistance-encoding DNAs under the control of a separate
cauliflower mosaic virus (CaMV) 35S promoter, and the attB-promoterless
zeomycin resistance-encoding DNA.

pAg1 is a binary vector containing Agrobacterium right and left T-DNA
border sequences for use in Agrobacterium-mediated transformation of plant
cells or protoplasts with the DNA located between the border sequences. pAg1
also contains the pBR322 Ori for replication in E. coli. pAg1 was constructed
by ligating HindIII/PstI-digested p3300attBZeo with HindIII/PstI-digested
pBSCaMV35SHyg as follows (see Figure 2).

a. Generation of p3300attBZeo

Plasmid pCambia 3300 was digested with PstI/Ecl136II and ligated with
PstI/StuI-digested pLITattBZeo (the nucleotide sequence of pLITattBZeo is
provided in SEQ. ID. NO: 19) to generate p3300attBZeo, which contains an attB
site, a promoterless zeomycin resistance-encoding DNA flanked at the 3' end
by a SV40 polyA signal, and a reconstructed PstI site.

b. Generation of pBSCaMV35SHyg

A DNA fragment containing DNA encoding hygromycin
phosphotransferase flanked by the CaMV 35S promoter and the CaMV 35S
polyA signal sequence was obtained by PCR amplification of plasmid pCambia
1302 (GenBank Accession No. AF234298 and SEQ. ID. NO: 3). The primers
used in the amplification reaction were as follows:

CaMV35SpolyA:
5'-CTGAATTAACGCCGAATTAATTCGGGGGATCTG-3' SEQ. ID. NO: 4
CaMV35Spr:
5'-CTAGAGCAGCTTGCCAACATGGTGGAGCA-3' SEQ. ID. NO: 5

The 2100-bp PCR fragment was ligated with EcoRV-digested pBluescript II SK+
(Stratagene, La Jolla, CA, U.S.A.) to generate pBSCaMV35SHyg.
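
For readers who wish to check basic properties of the two primers listed above, the following Python sketch computes their length, GC fraction and reverse complement. It is a convenience illustration only: the helper names are hypothetical, and no annealing temperature or other reaction condition is implied by it.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Primer sequences as given in SEQ. ID. NOS: 4 and 5 (5' to 3').
PRIMERS = {
    "CaMV35SpolyA": "CTGAATTAACGCCGAATTAATTCGGGGGATCTG",
    "CaMV35Spr": "CTAGAGCAGCTTGCCAACATGGTGGAGCA",
}


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(1 for base in seq if base in "GC") / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt, GC fraction %.2f" % gc_fraction(seq))
        print("  reverse complement:", reverse_complement(seq))
```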

c. Generation of pAg1

To generate pAg1, pBSCaMV35SHyg was digested with HindIII/PstI and
ligated with HindIII/PstI-digested p3300attBZeo. Thus, pAg1 contains the
pCambia 3300 backbone with DNA conferring resistance to phosphinothricin and
hygromycin under the control of separate CaMV 35S promoters, an
attB-promoterless zeomycin resistance-encoding DNA recombination cassette and
unique sites for adding additional markers, e.g., DNA encoding GFP. The attB
site facilitates the addition of new DNA sequences to plant or animal, e.g.,
mammalian, artificial chromosomes, including PACs formed as a result of using
the pAg1 vector, or derivatives thereof, in the production of PACs. The attB
site provides a convenient site for recombinase-mediated insertion of DNAs
containing a homologous att site.

2. pAg2

The vector pAg2 (SEQ. ID. NO: 6; see Figure 3) is a derivative of vector
pAg1 formed by adding DNA encoding a green fluorescent protein (GFP), under
the control of a NOS promoter and flanked at the 3' end by a NOS polyA signal,
to pAg1. pAg2 was constructed as follows (see Figure 4). A DNA fragment
containing the NOS promoter was obtained by digestion of pGEM-T-NOS, or
pGEMEasyNOS (SEQ. ID. NO: 7), containing the NOS promoter in the cloning
vector pGEM-T-Easy (Promega Biotech, Madison, WI, U.S.A.), with XbaI/NcoI
and was ligated to an XbaI/NcoI fragment of pCambia 1302 containing DNA
encoding GFP (without the CaMV 35S promoter) to generate p1302NOS (SEQ.
ID. NO: 8) containing GFP-encoding DNA in operable association with the NOS
promoter. Plasmid p1302NOS was digested with SmaI/BsiWI to yield a
fragment containing the NOS promoter and GFP-encoding DNA. The fragment
was ligated with PmeI/BsiWI-digested pAg1 to generate pAg2. Thus, pAg2
contains DNA from the bar gene conferring resistance to phosphinothricin, DNA
conferring resistance to hygromycin, both resistance-encoding DNAs under the
control of a cauliflower mosaic virus 35S promoter, DNA encoding kanamycin
resistance, a GFP gene under the control of a NOS promoter and the
attB-zeomycin resistance-encoding DNA. One of skill in the art will appreciate that
other fragments can be used to generate the pAg1 and pAg2 derivatives and
that other heterologous DNA can be incorporated into pAg1 and pAg2 derivatives
using methods well known in the art.

3. pAgIIa and pAgIIb transformation vectors

Vectors pAgIIa and pAgIIb were constructed by inserting the following
DNA fragments into pAg1: DNA encoding β-glucuronidase, the nopaline
synthase terminator fragment, the nopaline synthase (NOS) promoter fragment,
a fragment of mouse satellite DNA and an N. tabacum rDNA intergenic spacer
sequence (IGS). The construction of pAgIIa and pAgIIb was as follows (see
Figure 5).

An N. tabacum rDNA intergenic spacer (IGS) sequence (SEQ. ID. NO: 9;
see also GenBank Accession No. Y08422; see also Borysyuk et al. (2000)
Nature Biotechnology 18:1303-1306; Borysyuk et al. (1997) Plant Mol.
Biol. 35:655-660; U.S. Patent Nos. 6,100,092 and 6,355,860) was obtained by
PCR amplification of tobacco genomic DNA. The IGS can be used as a
targeting sequence by virtue of its homology to tobacco rDNA genes; the
sequence is also an amplification promoter sequence in plants. This fragment
was amplified using standard PCR conditions (e.g., as described by Promega
Biotech, Madison, WI, U.S.A.) from tobacco genomic DNA using the primers
shown below:

NTIGS-F1
5'-GTG CTA GCC AAT GTT TAA CAA GAT G-3' (SEQ ID No. 10) and
NTIGS-R1
5'-ATG TCT TAA AAA AAA AAA CCC AAG TGA C-3' (SEQ ID No. 11)

Following amplification, the fragment was cloned into pGEM-T Easy to give
pIGS-1.

A fragment of mouse satellite DNA (Msat1 fragment; GenBank Accession
No. V00846; and SEQ ID No. 12) was amplified via PCR from pSAT-1 using the
following primers:
MSAT-F1
5'-AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C-3' (SEQ ID No. 13)
and
MSAT-R1
5'-ATA ACC GCG GAG TCC TTC AGT GTG CAT-3' (SEQ ID No. 14)
This amplification added a SacII and a HindIII site at the 5' end and a SacII
site at the 3' end of the PCR fragment. This fragment was then cloned into the
SacII site in pIGS-1 to give pMIGS-1, providing a eukaryotic centromere-specific
DNA and a convenient DNA sequence for detection via FISH.
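
The restriction sites carried by the MSAT primers can be checked by scanning the
primer sequences for the enzyme recognition sequences. The following Python
sketch is illustrative only and is not part of the described procedure. The
function name find_sites and the small site table are assumptions introduced
here; the recognition sequences (SacII, CCGCGG; HindIII, AAGCTT) are the
standard ones for these enzymes.

```python
# Illustrative sketch: locate restriction recognition sequences in PCR primers.
# Primer sequences are those of SEQ ID Nos. 13 and 14 as given in the text;
# the helper name and the site table are hypothetical.

RECOGNITION_SITES = {
    "SacII": "CCGCGG",
    "HindIII": "AAGCTT",
}

MSAT_F1 = "AATACCGCGGAAGCTTGACCTGGAATATCGC"
MSAT_R1 = "ATAACCGCGGAGTCCTTCAGTGTGCAT"


def find_sites(primer, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for every site found."""
    seq = primer.replace(" ", "").upper()
    found = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            found[enzyme] = positions
    return found


if __name__ == "__main__":
    print("MSAT-F1:", find_sites(MSAT_F1))
    print("MSAT-R1:", find_sites(MSAT_R1))
```

In this sketch the forward primer reports one SacII and one HindIII site and the
reverse primer reports one SacII site, consistent with the sites the
amplification is stated to add.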

A functional marker gene containing a NOS-promoter:GUS:NOS
terminator fusion was then constructed containing the NOS promoter (GenBank
Accession No. U09365; SEQ ID No. 15), E. coli β-glucuronidase coding
sequence (from the GUS gene; GenBank Accession No. S69414; and SEQ ID
No. 16), and the nopaline synthase terminator sequence (GenBank Accession
No. U09365; SEQ ID No. 18). The NOS promoter in pGEM-T-NOS was added
to a promoterless GUS gene in pBlueScript (Stratagene, La Jolla, CA, U.S.A.)
using NotI/SpeI to form pNGN-1, which has the NOS promoter in the opposite
orientation relative to the GUS gene.

pMIGS-1 was digested with NotI/SpeI to yield a fragment containing the
mouse major satellite DNA and the tobacco IGS, which was then added to
NotI-digested pNGN-1 to yield pNGN-2. The NOS promoter was then re-oriented to
provide a functional GUS gene, yielding pNGN-3, by digestion and religation
with SpeI. Plasmid pNGN-3 was then digested with HindIII, and the HindIII
fragment containing the β-glucuronidase coding sequence and the rDNA
intergenic spacer, along with the Msat sequence, was added to pAg1 to form
pAgIIa, using the unique HindIII site in pAg1 located near the right T-DNA
border of pAg1, within the T-DNA region.

Another plasmid vector, referred to as pAgIIb, was also recovered, which
contained the inserted HindIII fragment in the opposite orientation relative to
that observed in pAgIIa. Thus, pAgIIa and pAgIIb differ only in the orientation
of the HindIII fragment containing the mouse major satellite sequence, the GUS
DNA sequence and the IGS sequence (see Figure 6). The nucleotide sequence
of pAgIIa is provided in SEQ. ID. NO: 21.

Vectors pAg1, pAg2, pAgIIa and pAgIIb, as well as similarly designed
vectors containing a recombination site and a promoter (e.g., plant or animal
promoter), and possibly other regulatory sequences, in operable association with
DNA encoding a protein or other product for expression in a host cell, such
as a plant or animal cell, can be used in the transfer of any protein (or other
product)-encoding nucleic acid of interest into a cell for expression thereof.
For example, any protein (or other product)-encoding nucleic acid of interest (in
operable association with transcriptional regulatory sequences suitable for use
in a particular host cell) can be inserted into any of the vectors pAg1, pAg2,
pAgIIa and pAgIIb and thereby incorporated into a plant, animal or other
artificial chromosome, particularly a platform artificial chromosome ACes, as
described herein.

Example 6 

Agrobacterium-Mediated Transformation of Plant Cells

Plant cells were transformed via Agrobacterium-mediated transformation
according to standard procedures (see, for example, Horsch et al. (1988) Plant
Molecular Biology Manual, A5:1-9, Kluwer Academic Publisher, Dordrecht,
Belgium). Briefly, Agrobacterium strain GV3101/pMP90 (see Koncz and Schell
(1986) Molecular and General Genetics 204:383-396) was transformed with
pAgIIa and pAgIIb (see Example 5) by heat shock, and the plasmid integrity of
pAgIIa and pAgIIb after transformation was verified by HindIII digest pattern.
pAgIIa/pMP90 or pAgIIb/pMP90 were cultured in 5 ml AB minimum medium
(Horsch et al. (1988) Plant Molecular Biology Manual, A5:1-9, Kluwer Academic
Publisher, Dordrecht, Belgium) containing 25 µg/ml kanamycin and 25 µg/ml
gentamycin at 28°C for two days.

Leaf disks of tobacco and Arabidopsis and root segments of Arabidopsis
were prepared as follows: tobacco leaves from 3 to 4 week-old explants were
cut into disks 1 cm in diameter, and Arabidopsis leaves were taken from 3
week-old seedlings and transversely cut in two halves. Roots of 3 week-old
Arabidopsis were excised into segments of 1 cm in length. Cocultivation was
carried out by immersing leaf disks or root segments in bacterial culture for 2
minutes and then transferring the infected tissues to culture medium without
antibiotics for 2 days at 22°C with 16 hours/day of cool white fluorescent
light. The leaf disks of tobacco and Arabidopsis were cultured on MS104 medium
(MS, 3% sucrose, 0.05% MES, 1.0 mg/l BA, 0.1 mg/l NAA and 0.8% agar, pH 5.8) and
root segments on callus-inducing medium, CIM 0.5/0.05 (B5, 2% glucose,
0.05% MES, 0.5 mg/l 2,4-D, 0.05 mg/l kinetin and 0.8% agar, pH 5.8).

The transformed leaf disks and root segments were then transferred to
selection medium of MS104 or CIM 0.5/0.05, respectively, containing 20 mg/l
hygromycin and 300 mg/l Timentin for the elimination of Agrobacterium. The
selection medium was refreshed every two weeks and green shoots
regenerated. Plants were analyzed for the expression of the DNA encoding GUS
by standard histochemical and fluorescent assays and for evidence of
amplification of the inserted DNA by quantitative PCR. Numerous plants were
obtained that expressed high levels of GUS, and multiple copies of the GUS gene
were observed by fluorescence in situ hybridization (FISH) and PCR analysis.
Thus, amplification of the chromosomal regions containing the inserted DNA was
observed. One of skill in the art will appreciate that GUS expression, or the
expression of any other gene, can be assessed using methods well known in the
art.

Example 7

Transfection and Culture of Arabidopsis Protoplasts

E. coli strain Stbl4 (Gibco Life Sciences) was transformed with pAgIIa,
pAgIIb, and one of two targeting plasmids containing the rDNA repeat sequence
from Arabidopsis (plasmid pJHD-14A) or the 26S rDNA from Arabidopsis (plasmid
pJHD2-19A), as described by Doelling et al. [(1993) Proc. Natl. Acad. Sci.
U.S.A. 90:7528-7532], via electroporation according to standard procedures.
A single colony was grown up in 250 ml LB medium containing 50 µg/ml
kanamycin (for selection based on the kanamycin resistance-encoding DNA in
pAgIIa and pAgIIb) or 50 µg/ml ampicillin (for selection based on the ampicillin
resistance-encoding DNA in pJHD-14A and pJHD2-19A) and cultured at 30°C
with shaking at 225 rpm for 16 hours. The plasmids were isolated according to
standard procedures well known in the art. The structural integrity of the
plasmids was checked by restriction digestion pattern, and the plasmids were
linearized with restriction enzymes. Plasmids were sterilized with chloroform
and 70% ethanol before use for transfection.

Arabidopsis protoplasts were resuspended in the culture medium (see
Example 1) at a density of 2 x 10^6 protoplasts/ml. A 300 µl protoplast
suspension was pipetted into a 15 ml tube, and 30 µl of plasmid (pAgIIa or
pAgIIb) and targeting DNA (pJHD-14A or pJHD2-19A), containing 10 µg plasmid
and 100 µg targeting sequence, was added, followed immediately by slowly
adding 300 µl of 10% PEG. The targeting plasmids were included in the
transfection procedure in order to ensure that the amount of rDNA targeting DNA
(i.e., tobacco rDNA from pAgIIa or pAgIIb and Arabidopsis DNA from the targeting
vectors) was sufficient to effect recombination of the introduced DNA at a
homologous site in an Arabidopsis chromosome. DNA was typically used in a
ratio of 10:1, targeting DNA (pJHD-14A or pJHD2-19A, or Lambda DNA) to
plasmid DNA (pAgIIa or pAgIIb, or a selectable marker plasmid), or in a ratio of
5:1. Generally, the number of base pairs of targeting DNA sufficient for
insertion into a plant chromosome is at least about 50 bp, or about 60 bp, or
about 70 bp, or about 80 bp, or about 90 bp, or about 100 bp, or about 150
bp, or about 200 bp, or about 300 bp, or about 400 bp, or about 500 bp, or
about 600 bp, or about 700 bp, or about 800 bp, or about 900 bp, or about 1
kb, or about 2 kb, or about 3 kb, or about 4 kb, or about 5 kb, or about 6 kb,
or about 7 kb, or about 8 kb, or about 9 kb, or about 10 kb or more. The
amount and length of targeting DNA sufficient to effect introduction into a
chromosome can be determined empirically and can vary for different plant
species.
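
The quantities in a single transfection can be worked through numerically. The
following Python sketch is illustrative only; the function names are
hypothetical, and it assumes the amounts consistent with the stated 10:1 ratio
(10 µg plasmid and 100 µg targeting DNA) added to 300 µl of suspension at
2 x 10^6 protoplasts/ml.

```python
# Illustrative arithmetic for one transfection; all names are hypothetical.

def protoplast_count(density_per_ml, volume_ul):
    """Number of protoplasts contained in volume_ul of suspension."""
    return density_per_ml * volume_ul / 1000.0


def targeting_to_plasmid_ratio(targeting_ug, plasmid_ug):
    """Mass ratio of targeting DNA to plasmid DNA."""
    return targeting_ug / plasmid_ug


def dna_per_protoplast_pg(total_dna_ug, n_protoplasts):
    """Average DNA offered per protoplast in picograms (1 ug = 1e6 pg)."""
    return total_dna_ug * 1e6 / n_protoplasts


if __name__ == "__main__":
    n = protoplast_count(2e6, 300)
    ratio = targeting_to_plasmid_ratio(100, 10)
    per_cell = dna_per_protoplast_pg(100 + 10, n)
    print(f"protoplasts per tube: {n:.0f}")
    print(f"targeting:plasmid ratio: {ratio:.0f}:1")
    print(f"DNA offered per protoplast: {per_cell:.0f} pg")
```

Under these assumptions each tube holds about 6 x 10^5 protoplasts, so the same
helpers can be reused to scale the DNA amounts when a 5:1 ratio or a different
suspension volume is tried.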

The mixture was shaken gently, and immediately 300 µl of 10% PEG
solution was added slowly with gentle shaking. The protoplast mixture was
incubated at 22°C for 10-15 min with several cycles of gentle shaking. DNA
uptake was quenched by the addition of 5 ml of 72.4 g/l Ca(NO3)2. The
protoplasts were then centrifuged at 80 x g for 7 min and resuspended in culture
medium. For selection, 10 to 40 mg/l hygromycin was added to protoplast
cultures 14 days after transfection, and the culture medium was refreshed every
7 days. The protoplast cultures could also be selected after embedding in 0.6%
agarose by transferring to a culture medium containing 20 mg/l hygromycin. The
cultures were incubated for 14 days or longer at 22°C.
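
The quench solution is specified by mass concentration (72.4 g/l). As a rough
guide, the corresponding molarity can be computed as in the Python sketch below.
The text does not state whether the anhydrous salt or the tetrahydrate is
meant, so both are shown as assumptions; the function names are hypothetical
and the atomic masses are standard rounded values.

```python
# Illustrative conversion of 72.4 g/l calcium nitrate to molarity.
# The hydrate form is NOT specified in the text; both cases are assumptions.

ATOMIC_MASS = {"Ca": 40.078, "N": 14.007, "O": 15.999, "H": 1.008}


def calcium_nitrate_molar_mass(waters=0):
    """Molar mass (g/mol) of Ca(NO3)2 with the given number of waters."""
    nitrate = ATOMIC_MASS["N"] + 3 * ATOMIC_MASS["O"]
    water = 2 * ATOMIC_MASS["H"] + ATOMIC_MASS["O"]
    return ATOMIC_MASS["Ca"] + 2 * nitrate + waters * water


def molarity(grams_per_litre, molar_mass):
    """Molar concentration (mol/l) from a mass concentration."""
    return grams_per_litre / molar_mass


if __name__ == "__main__":
    for label, waters in (("anhydrous", 0), ("tetrahydrate", 4)):
        mm = calcium_nitrate_molar_mass(waters)
        print(f"{label}: {mm:.2f} g/mol, {molarity(72.4, mm):.3f} M")
```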

The Arabidopsis protoplasts were analyzed for the presence and
expression of the DNA encoding GUS. Recovered microcalli strongly expressed
GUS and were resistant to selective agents, indicating amplification of the
inserted DNA. Alternatively, the transfection of Arabidopsis protoplasts can
be conducted without using targeting DNA sequences since pAgIIa and pAgIIb
include a region of rDNA (i.e., the tobacco rDNA IGS) that can act as a
targeting sequence as long as a sufficient amount of pAgIIa/b plasmid is used in
the transfection procedure.

Example 8

Transfection and Culture of Tobacco Protoplasts

As described in Example 7, E. coli strain Stbl4 was transformed with pAgIIa,
pAgIIb, pJHD-14A (targeting DNA) and pJHD2-19A (targeting DNA) via
electroporation, and plasmid DNA was recovered and linearized with restriction
enzymes. Plasmids were sterilized with chloroform and 70% ethanol before use
for transfection.

The tobacco protoplasts (see Examples 2 and 3) were resuspended in the
culture medium (see Example 2) at a density of 2 x 10^6 protoplasts/ml. A 300
µl protoplast suspension was pipetted into a 15 ml tube, and 30 µl of plasmid
and targeting DNA was added as described in Example 7. The mixture was
shaken gently, and immediately 300 µl of 10% PEG solution was added slowly
with gentle shaking. The tobacco protoplast mixture was incubated at 22°C
for 10-15 min with several cycles of gentle shaking. DNA uptake was
quenched by the addition of 5 ml of 72.4 g/l Ca(NO3)2. The protoplasts were then
centrifuged at 80 x g for 7 min and resuspended in culture medium.

The recovery of viable tobacco protoplasts following DNA uptake ranged
from 65% to 75%. Typically greater than 35% of the
protoplasts initiated cell division within 7 days of treatment. Protoplast cells
were analyzed for gene expression (in this case for the expression of the
reporter DNA GUS, but alternatively, the expression of other genes can be
monitored). Between 4% and 6% of the recovered cells exhibited GUS
expression.
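
To give a sense of scale, the reported percentages can be converted into
approximate cell numbers. The Python sketch below is illustrative only: it
assumes an input of 6 x 10^5 protoplasts per tube (300 µl at 2 x 10^6/ml) and
that the GUS percentage applies to the recovered viable cells; the function
names are hypothetical.

```python
# Illustrative estimate of GUS-expressing cells per transfection tube.
# Assumptions: 6e5 input protoplasts; GUS fraction is of recovered cells.

def expected_gus_positive(n_input, recovery, gus_fraction):
    """Cells expected to express GUS for given fractional recovery and rate."""
    return n_input * recovery * gus_fraction


def expected_range(n_input, recovery_range=(0.65, 0.75), gus_range=(0.04, 0.06)):
    """Lower and upper estimates using the ends of both reported ranges."""
    low = expected_gus_positive(n_input, recovery_range[0], gus_range[0])
    high = expected_gus_positive(n_input, recovery_range[1], gus_range[1])
    return low, high


if __name__ == "__main__":
    low, high = expected_range(6e5)
    print(f"estimated GUS-positive cells per tube: {low:.0f} to {high:.0f}")
```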

The protoplasts were subjected to selection procedures to recover
transformed cells. For selection of tobacco cells, 10 to 40 mg/l hygromycin
was added to protoplast cultures 10-14 days after transfection, and the culture
medium was refreshed every 7 days. Leaf disc selection was performed in the
presence of 40 mg/l hygromycin. Transformed microcalli were recovered and
analyzed for the expression of the GUS reporter gene. GUS positive calli were
isolated and subjected to FISH analysis (see Example 13). Plant cells that
exhibited amplification of the inserted DNA were identified.

Example 9 

Transfection and Culture of Brassica Protoplasts 

Brassica protoplasts (see Example 4), following the final washing step
after filtering through a 63 µm nylon screen and centrifugation, are collected
and used for DNA transfection as described in Example 8. Brassica protoplast
cultures following DNA uptake or transformation by Agrobacterium can be
selected with either hygromycin or glufosinate ammonium in liquid culture or in
embedded semi-solid cultures. The effective concentration of hygromycin is 10
to 40 mg/l for 2 to 4 weeks or continuously, whereas that for glufosinate
ammonium is 2 to 60 mg/l for 5 days to 2 weeks. Selection can impede growth,
and additional transfers to similar media may be required.

Example 10 
Plant Regeneration from Brassica Protoplasts 

Colonies of Brassica protoplasts (1 mm or larger in diameter) are plated
onto regeneration medium (basal Murashige and Skoog's medium, 1% sucrose,
2 mg/l BA, 0.01 mg/l NAA, 0.8% agarose, pH 5.6). Cultures are incubated
under the conditions described in Example 4. Cultures are transferred onto
fresh regeneration medium every 2 weeks. Regenerated shoots are transferred
onto autoclaved rooting medium (basal Murashige and Skoog's medium, 1%
sucrose, 0.1 mg/l NAA, 0.8% agar, pH 5.8) and incubated under dim
fluorescent light (25 µE m^-2 s^-1). Plantlets are potted in a soil-less mix (for
example, Terra-lite Redi-Earth, W.R. Grace & Co., Canada Ltd., Ajax, Ontario)
containing fertilizer (Nutricote 14-14-14 type 100, Plant Products Co. Ltd,
Brampton, Ontario) and grown in a growth room (20°C/15°C, 16 h
photoperiod, 100-140 µE m^-2 s^-1) with fluorescent and incandescent light at
soil level. Plantlets are covered with transparent plastic cups for one week to
allow for acclimatization.

Example 11

Isolation of Nuclei from Protoplasts

To facilitate analysis, plant cells can be subjected to nuclei isolation, and
the isolated nuclei can be analyzed by FISH or PCR. To isolate the nuclei,
protoplast calli were reprotoplasted according to the procedure of Mathur et al.
with modifications (see Mathur et al. Plant Cell Reports (1995) 14:221-226).
The protoplast calli were digested with 1.2% Cellulase 'Onozuka' R-10 and
0.4% w/v Macerozyme R-10 in nuclei isolation buffer (10 mM MES-pH 5.5,
0.2 M sucrose, 2.5 mM EDTA, 2.5 mM DTT, 0.1 mM spermine, 10 mM NaCl,
10 mM KCl and 0.15% Triton X-100) for 3 hours. After centrifugation at 80
x g for 10 minutes, the pellets of protoplasts were resuspended in hypertonic
buffer of 12.5% W5 solution (Hinnisdaels et al. (1994) Plant Molecular Biology
Manual G2:1-13, Kluwer Academic Publisher, Belgium) for 10 minutes. To
promote disruption of protoplasts, the protoplast suspension was forced through
a syringe needle four times. The disrupted protoplasts were filtered through 5
µm meshes to remove debris and centrifuged at 200 x g for 10 min. By
repeated washing of the pellet in a nuclei isolation buffer containing
phenylmethylsulfonylfluoride (PMSF) and centrifugation at 200 x g for 10
minutes, nuclei were collected as a white pellet freed from cytoplasm
contamination and cellular debris. Samples were fixed in 3:1 methanol:glacial
acetic acid and were analyzed by FISH.
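
When preparing the nuclei isolation buffer, the molar concentrations must be
converted to masses for a chosen batch volume. The Python sketch below shows the
conversion for three of the components only (sucrose, NaCl and KCl); the batch
volume of 100 ml, the dictionary names and the function names are assumptions
introduced for illustration, and the molar masses are standard rounded values.

```python
# Illustrative mass calculation for part of the nuclei isolation buffer.
# Only components with unambiguous molar masses are included; the batch
# volume used in the example is hypothetical.

MOLAR_MASS_G_PER_MOL = {"sucrose": 342.30, "NaCl": 58.44, "KCl": 74.55}

# Concentrations from the buffer composition given in the text (mol/l).
NUCLEI_BUFFER_MOLAR = {"sucrose": 0.2, "NaCl": 0.010, "KCl": 0.010}


def grams_needed(component, molar, volume_l):
    """Mass in grams of a component for the given molarity and volume."""
    return molar * volume_l * MOLAR_MASS_G_PER_MOL[component]


def recipe(volume_l):
    """Masses in grams for each listed component at the given batch volume."""
    return {
        name: grams_needed(name, molar, volume_l)
        for name, molar in NUCLEI_BUFFER_MOLAR.items()
    }


if __name__ == "__main__":
    for name, grams in recipe(0.1).items():
        print(f"{name}: {grams:.4f} g per 100 ml")
```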

Example 12

Mitotic Arrest of Plant Cells for Detection of Amplification and
Artificial Chromosome Formation

In general, plant cells or protoplasts are cultured for two or more
generations prior to mitotic arrest. Typically, 5 µg/ml colchicine is added to
the cultures for 12 hours to accumulate mitotic plant cells. The mitotic cells
are harvested by gentle centrifugation. Alternatively, plant cells (grown on
plastic or in suspension) can be arrested in different stages of the cell cycle
with chemical agents other than colchicine, such as, but not limited to,
hydroxyurea, vinblastine, colcemid or aphidicolin or through the deprivation of
nutrients, hormones, or growth factors. Chemical agents that arrest the cells in
stages other than mitosis, such as, but not limited to, hydroxyurea and
aphidicolin, are used to synchronize the cycles of all cells in the population
and are then removed from the cell medium to allow the cells to proceed, more or
less simultaneously, to mitosis at which time they can be harvested to disperse
the chromosomes.
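
The volume of colchicine stock needed to reach 5 µg/ml follows from the usual
dilution relation C1 x V1 = C2 x V2. The Python sketch below is illustrative
only; the 1 mg/ml stock concentration and 50 ml culture volume are hypothetical
values chosen for the example, and the small volume contributed by the stock
itself is neglected.

```python
# Illustrative dilution calculation (C1 * V1 = C2 * V2).
# Stock concentration and culture volume below are hypothetical.

def stock_volume_ml(final_ug_per_ml, culture_ml, stock_ug_per_ml):
    """Volume of stock (ml) giving the desired final concentration."""
    if stock_ug_per_ml <= final_ug_per_ml:
        raise ValueError("stock must be more concentrated than the target")
    return final_ug_per_ml * culture_ml / stock_ug_per_ml


if __name__ == "__main__":
    v = stock_volume_ml(final_ug_per_ml=5, culture_ml=50, stock_ug_per_ml=1000)
    print(f"add {v * 1000:.0f} ul of stock to 50 ml of culture")
```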
Example 13

Detection of Amplification and Artificial Chromosome Formation by
Fluorescence In Situ Hybridization (FISH)

A variety of plant cells can be analyzed by fluorescence in situ hybridization
(FISH) methods (Fransz et al. (1996) Plant J. 9:421-430; Fransz et al. (1998)
Plant J. 13:867-876; Wilkes et al. (1995) Chromosome Research 3:466-472;
Busch et al. (1994) Chromosome Research 2:15-20; Nkongolo (1993) Genome
35:701-705; Leitch et al. (1994) Methods in Molecular Biology 23:177-185;
Murata et al. (1997) Plant J. 12:31-37) to identify amplification events and
artificial chromosome formation.

FISH is used to detect specific DNA sequences on chromosomes, in
particular to detect regions of plant chromosomes that have undergone
amplification as a result of the introduction of heterologous DNA as described
herein, or to detect artificial chromosome formation in plant cells. FISH
chromosome spreads of Arabidopsis and tobacco plant cells into which
heterologous DNA has been introduced are generated using colchicine or similar
cell cycle arresting agents and various DNA probes (e.g., rDNA probe, Lambda
DNA probe, selectable marker probe). The cells are analyzed for the presence
of amplified regions of chromosomes, in particular amplification of the rDNA
regions, and those cells exhibiting amplification are further cultured and
analyzed for the formation of artificial chromosomes.

The chromosomes of plant cells subjected to introduction of heterologous
DNA and growth to generate artificial chromosomes can also be analyzed by
scanning electron microscopy. Preparation of mitotic chromosomes for
scanning electron microscopy can be performed using methods known in the
art (see, e.g., Sumner (1991) Chromosoma 100:410-418). The chromosomes
can be observed, for example, with a Hitachi S-800 field emission scanning
electron microscope operated with an accelerating voltage of 25 kV.

Example 14

Detection of Amplification and Artificial Chromosome Formation by
IdU Labeling of Chromosomes

The structure of the chromosomes in plant cells can be analyzed by labeling
the chromosomes with iododeoxyuridine (IdU), or other nucleotide analog, and
using an IdU-specific antibody to visualize the chromosome structure. Plant cell
cultures selected following introduction of heterologous DNA are labeled with
IdU following standard protocols (Fujishige and Taniguchi (1998) Chromosome
Research 6:611-619; Yanpaisan et al. (1998) Biotechnology and Bioengineering,
58:515-528; Trick and Bates (1996) Plant Cell Reports, 15:986-990; Binarova
et al. (1993) Theoretical and Applied Genetics, 87:9-16; Wang et al. (1991)
Journal of Plant Physiology, 138:200-203). Plant cells in culture, typically
suspension culture, are used. A series of sub-cultures is initiated, and IdU
labeling is performed as described above. Cells are allowed to incorporate IdU
for up to a week, depending on the doubling time of the culture. Labeled
chromosomes can be detected in plant cells (Fujishige and Taniguchi (1998)
Chromosome Research 6:611-619; Binarova et al. (1993) Theoretical and
Applied Genetics 87:9-16) and in mammalian cells (Gratzner and Leif (1981)
Cytometry 1:385-393) using procedures well known in the art. IdU-labeled
chromosomes are detected by immunocytochemical techniques. An anti-IdU
fluorescein isothiocyanate (FITC)-conjugated B44 clone antibody (Becton
Dickinson) is used to bind the IdU-DNA adduct in the DNA and is detected by
fluorescence microscopy (490 nm excitation, 519 nm emission). Analysis of
labeled chromosomes reveals the presence of amplified DNA regions and the
formation of artificial chromosomes.

Example 15 

Isolation of Metaphase Chromosomes from Protoplasts 

Artificial chromosomes, once detected in plant cells, may be isolated for
transfer to other organisms and in particular other plant species. Several
procedures may be used to isolate metaphase chromosomes from mitotic-arrested
plant cells, including, but not limited to, a polyamine-based buffer
system (Cram et al. (1990) Methods in Cell Biology 33:377-382), a modified
hexylene glycol buffer system (Hadlaczky et al. (1982) Chromosoma
86:643-659), a magnesium sulfate buffer system (Van den Engh et al. (1988)
Cytometry 9:266-270 and Van den Engh et al. (1984) Cytometry 5:108), an
acetic acid fixation buffer system (Stoehr et al. (1982) Histochemistry
74:57-61), and a technique utilizing hypotonic KCl and propidium iodide (Cram
et al. (1994) XVII meeting of the International Society for Analytical Cytology,
October 16-21, Tutorial IV Chromosome Analysis and Sorting with Commercial
Flow Cytometers; Cram et al. (1990) Methods in Cell Biology 33:376; de Jong
et al. (1999) Cytometry 35:129-133).

In an exemplary procedure, a hexylene glycol buffer is used to isolate plant
chromosomes from mitotic-arrested plant cells that have been converted to
protoplasts (Hadlaczky et al. (1982) Chromosoma 86:643-659). Chromosomes
are isolated from about 10^6 mitotic cells re-suspended in a glycine-hexylene
glycol buffer (100 mM glycine, 1% hexylene glycol, pH 8.4-8.6, adjusted with
a solution of saturated Ca(OH)2) supplemented with 0.1% Triton X-100 (GHT
buffer). The cells are incubated for 10 minutes at 37°C, and the chromosomes
are purified by differential centrifugation to pellet the nuclei (200 x g for 20
min) and sucrose gradient centrifugation (5-30% sucrose, 5600 x g for 60 min,
0-4°C). To avoid proteolytic degradation of chromosomal proteins, 1 mM PMSF
(phenylmethylsulfonylfluoride) is used in the presence of 1% isopropyl alcohol.
The proteins can be extracted from the isolated chromosomes using dextran
sulfate-heparin (DSH) extraction, and the chromosomes can be visualized via
electron microscopy using techniques known in the art (Hadlaczky et al. (1982)
Chromosoma (Berl.) 86:643-659; Hadlaczky et al. (1981) Chromosoma (Berl.)
81:537-555). Additionally, modifications of these procedures, including, but
not limited to, modification of the buffer composition (Carrano et al. (1979)
Proc. Natl. Acad. Sci. U.S.A. 76:1382-1384) and variation of the centrifugation
time or speed, to accommodate different plant species can be implemented by
any skilled artisan.
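
Because the centrifugation steps are specified as relative centrifugal force,
the rotor speed depends on the rotor radius. The Python sketch below applies the
standard relation RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm). The 10 cm radius
is a hypothetical value used only for illustration, and the function names are
assumptions.

```python
# Illustrative conversion between relative centrifugal force and rotor speed.
# RCF = 1.118e-5 * radius_cm * rpm**2; the rotor radius used is hypothetical.
import math

RCF_CONSTANT = 1.118e-5


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) that gives the requested RCF at the given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rcf_from_rpm(rpm, radius_cm):
    """RCF (x g) produced at the given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


if __name__ == "__main__":
    radius = 10.0  # cm, hypothetical rotor
    for rcf in (200, 5600):
        print(f"{rcf} x g at {radius} cm: {rpm_from_rcf(rcf, radius):.0f} rpm")
```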

Example 16 

Transfer of Artificial Chromosomes into Plant Cells: Transfer of
Mammalian Artificial Chromosomes into a Dicot Plant: Arabidopsis

One method of delivery of mammalian artificial chromosomes (MACs) into
plant cells is the formation of microcells containing murine MACs and the
CaPO4-mediated uptake or the PEG-mediated fusion of these microcells with
plant protoplasts. In this example, microcells and plant protoplasts, such as but
not limited to tobacco and Arabidopsis protoplasts, were mixed (in a series of
25:1, 10:1, 5:1, or 2:1 microcells:protoplasts ratios) and fusion was observed.
Protocols for the formation of microcells are known in the art and are described,
for example, in US Patent Nos. 5,240,840, 4,806,476 and 5,298,429 and in
Fournier Proc. Natl. Acad. Sci. U.S.A. (1981) 78:6349-6353 and Lambert et al.
Proc. Natl. Acad. Sci. U.S.A. (1991) 88:5907-5912. The murine microcells
can be labeled with IdU or the MACs stained with a specific dye such as, but
not limited to, propidium iodide or DAPI, prior to fusion with plant
protoplasts including, but not limited to, Arabidopsis and tobacco protoplasts,
to facilitate detection of the presence of MACs in the protoplasts.
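
The number of microcells required for each ratio in the series depends on the
number of protoplasts used. The Python sketch below is illustrative only; the
protoplast count of 1 x 10^5 and the function name are hypothetical.

```python
# Illustrative calculation of microcell numbers for a ratio series.
# The protoplast count used in the example is hypothetical.

RATIO_SERIES = (25, 10, 5, 2)  # microcells per protoplast, as in the text


def microcells_needed(n_protoplasts, ratio):
    """Microcells required for a given microcell:protoplast ratio."""
    return n_protoplasts * ratio


if __name__ == "__main__":
    n_protoplasts = 100_000
    for ratio in RATIO_SERIES:
        total = microcells_needed(n_protoplasts, ratio)
        print(f"{ratio}:1 -> {total:,} microcells")
```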

In this example, MACs were introduced into Arabidopsis cells using
microcell-PEG mediated fusion. Microcells were formed from murine cells
containing an artificial chromosome (see U.S. Patent No. 6,077,697) and were
fused with freshly prepared Arabidopsis protoplasts in a ratio of 10:1,
microcells to protoplasts. Fusion occurred in the presence of 25% PEG 6000,
204 mM CaCl2, pH 6.9 within the first 5 minutes of mixing. Typically less than
about one minute of mixing is required to observe fusion between microcells
and protoplasts. Fused cells were washed with 240 mM CaCl2, then floated on
top of a solution of 204 mM sucrose in B5 salts. Cells were then transferred to
cell suspension culture media (MS, 87 mM sucrose, 2.7 µM naphthalene acetic
acid, 0.23 µM kinetin, pH 5.8). Empirical observations can be used to
determine the optimal concentration and composition of PEG and the
concentration of calcium that provides the highest degree of fusion with the
least toxicity.
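
The culture medium components are given as molar concentrations, whereas media
elsewhere in this description are given in mg/l or percent. The Python sketch
below converts between the two, assuming the hormone concentrations are
micromolar. The molar masses are standard rounded values (sucrose 342.30,
1-naphthaleneacetic acid 186.21, kinetin 215.21 g/mol) and the function name is
hypothetical.

```python
# Illustrative conversion of molar medium components to mg/l.
# Assumes hormone concentrations are micromolar; names are hypothetical.

MOLAR_MASS_G_PER_MOL = {"sucrose": 342.30, "NAA": 186.21, "kinetin": 215.21}


def mg_per_litre(component, micromolar):
    """Mass concentration (mg/l) for a concentration given in micromolar."""
    return micromolar * 1e-6 * MOLAR_MASS_G_PER_MOL[component] * 1000.0


if __name__ == "__main__":
    print(f"sucrose, 87 mM: {mg_per_litre('sucrose', 87_000) / 1000:.1f} g/l")
    print(f"NAA, 2.7 uM: {mg_per_litre('NAA', 2.7):.2f} mg/l")
    print(f"kinetin, 0.23 uM: {mg_per_litre('kinetin', 0.23):.3f} mg/l")
```

Under these assumptions the hormones work out to roughly 0.5 mg/l and 0.05 mg/l,
and the sucrose to roughly 3% (w/v).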

Fused protoplasts were allowed to grow for one or more generations.
The presence of a mouse chromosomal sequence, including MACs, was
demonstrated by Southern hybridization with MAC probes, by FISH analysis and
by PCR analysis using, for example, satellite sequences known to exist on the
MAC chromosome. Thus, the mouse sequences were detected in the
Arabidopsis protoplasts.

To further demonstrate the transfer of mouse chromosomal sequence to
Arabidopsis protoplasts, Arabidopsis plant cell nuclei were isolated according
to Example 11 and were subjected to FISH analysis according to Example 13,
using the mouse major satellite DNA (SEQ ID No. 12). A portion of the nuclei
contained a significant signal using the mouse major satellite DNA, indicating
successful transfer of at least a mouse chromosome and/or MAC to the
Arabidopsis nuclei.

Similarly, PACs may be introduced into Arabidopsis protoplasts using
PEG- and/or calcium-mediated fusion procedures. Generation of
microprotoplasts and protoplasts can be conducted as described, for example,
in Example 1. Microprotoplasts formed from plant cells containing a plant
artificial chromosome are fused with freshly prepared Arabidopsis protoplasts,
for example, in a ratio of 10:1, microprotoplasts to protoplasts. Protoplasts
from other plants, including but not limited to, tobacco, wheat, maize and rice,
can also be used as the recipient of MACs and/or PACs. Fused protoplasts are
recovered and allowed to grow for one or more generations. The presence of
the transferred PACs can be analyzed using methods such as, for example,
those described herein (including Southern hybridization with PAC probes, FISH
analysis and PCR analysis using DNA sequences specific to the PAC).

Example 17

Transfer of Artificial Chromosomes into Plant Cells: Transfer of
Mammalian Artificial Chromosomes into a Second Dicot Plant: Tobacco

MACs were introduced into tobacco cells using microcell-PEG mediated
fusion using the same microcells, MAC, and protocol as described in Example
16. Microcells were formed from murine cells containing an artificial
chromosome and were fused with freshly prepared tobacco BY-2 protoplasts in
a ratio of 10:1, microcells to protoplasts. Fusion occurred in the presence of
20% PEG 4000 and 100-200 mM calcium chloride. Empirical observations are
used to determine the optimal concentration and composition of PEG and the
concentration of calcium that provides the highest degree of fusion with the
least toxicity.

DAPI staining of the microcells (e.g., by preincubation of the microcells
with DAPI added to a final concentration of 1 µg/ml)
allowed visualization of the fusion and transfer of the chromosomes to the
tobacco protoplasts. Fused protoplasts were recovered and allowed to grow for
one or more generations. The fused protoplasts can be analyzed for the
presence of a MAC in a number of ways, including those described herein.
Fused tobacco cell nuclei were isolated from tobacco protoplasts that had been
fused with microcells according to Example 11 and were subjected to FISH
analysis according to Example 13, using the mouse major satellite DNA (SEQ
ID No. 12). Numerous nuclei were found to have incorporated a mouse
chromosome.

Example 18 

Transfer of Isolated Artificial Chromosomes by Lipid-Mediated Transfer
into a Monocot Plant: Rice

Isolated murine artificial chromosomes (MACs) prepared by sorting
through a FACS apparatus (de Jong et al. Cytometry (1999) 35:129-133) were
transferred into rice plant protoplasts by cationic lipid-mediated transfection
of the purified MAC. Purified MACs (see Example 15 and U.S. Patent No.
6,077,697) were mixed with LipofectAMINE 2000 (Gibco, Md, USA) as follows.
Typically, 15 µl of LipofectAMINE 2000 were added to 1 x 10^6 artificial
chromosomes in liquid buffer, the solution was allowed to complex for up to
three hours, and then the solution was added to 1 x 10^5 freshly prepared rice
protoplasts obtained using standard protoplast methods well known in the art.
The uptake of the lipid-complexed artificial chromosome was monitored by
adding to the mixture of protoplasts and purified artificial chromosomes a
fluorescent dye that stains DNA. Microscopic examination of the
protoplast/artificial chromosome mixture over the next several hours allowed the
visualization of the artificial chromosome being transported across the
protoplast cellular membrane and the presence of the readily identifiable MAC
in the cytoplasm of the rice plant cell.

The same procedure as described in this Example for cationic
lipid-mediated transfer of an isolated MAC into rice protoplasts can be used to
transfer isolated MACs, as well as PACs, into rice and other plant protoplasts,
including but not limited to, tobacco, wheat, maize and Arabidopsis. Fused
protoplasts are recovered and allowed to grow for one or more generations.
The presence of the transferred MACs and PACs can be analyzed using
methods such as, for example, those described herein (including, but not limited
to, Southern hybridization with PAC probes, FISH analysis and PCR analysis
using DNA sequences specific to the PAC).

Example 19 

Delivery of Plant Regulatory and Coding Sequences via a Promoterless attBZeo 
Marker Gene in pAg2 onto a MAC Platform 

As described in Examples 6-15, the plasmid pAg2, comprising plant regulatory
and selectable marker genes (SEQ ID NO: 6; prepared as set forth in Example
5), can be used for the production of a MAC containing said plant expressible
genes. In this example, pAg2, by virtue of the attBZeo DNA sequences
contained on the plasmid, is used for the loading of plant regulatory and
selectable marker genes onto MACs in mammalian cells using the attB sequences
to recombine with attP sequences present on a platform MAC. In this example,
platform MACs are produced with attP sequences and the plasmid pAg2 is then
loaded onto the platform MAC. New MACs so produced are useful for
introduction into plant cells by virtue of the plant expressible markers
contained therein.
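
The loading step described in this example can be pictured with a toy model.
The following Python sketch is an illustration only and is not part of the
disclosed method. It represents the platform MAC region and the circular
donor plasmid as ordered lists of feature names and shows how integration at
the att sites places the promoterless marker downstream of the platform
promoter. The feature order within pAg2, the name `plant_marker_cassettes`,
and the labels `att_hybrid_1` and `att_hybrid_2` for the recombined sites are
assumptions made for the illustration.

```python
from typing import List


def integrate(platform: List[str], donor: List[str],
              platform_site: str = "attP", donor_site: str = "attB") -> List[str]:
    """Toy model of integrative site-specific recombination.

    The circular donor is opened at its att site and inserted at the att
    site of the platform; the two parental sites are replaced by two
    hybrid sites that flank the inserted DNA.
    """
    i = platform.index(platform_site)
    j = donor.index(donor_site)
    # Linearize the circular donor so it starts just after its att site.
    insert = donor[j + 1:] + donor[:j]
    return (platform[:i] + ["att_hybrid_1"] + insert
            + ["att_hybrid_2"] + platform[i + 1:])


# Hypothetical feature maps (order of features on pAg2 is assumed).
platform_region = ["SV40_promoter", "attP", "puro"]
donor_plasmid = ["attB", "zeo_promoterless", "plant_marker_cassettes"]

product = integrate(platform_region, donor_plasmid)
print(product)
```

In this simplified picture, the zeocin resistance sequence is expressed only
after integration, because only then does it sit downstream of a promoter.
That is why drug selection enriches for cells carrying the integration event.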

A. Construction of Platform MAC containing pSV40attPsensePUR (Figure 7;
SEQ ID NO: 26).

An example of a selectable marker system for the creation of a MAC-based
platform into which the plasmid pAg2 can target plant regulatory and coding
sequences is shown in Figure 7. This system includes a vector containing the
SV40 early promoter immediately followed by (1) a 282 base pair (bp) sequence
containing the bacteriophage lambda attP site and (2) the puromycin
resistance marker. Initially a PvuII/StuI fragment containing the SV40 early
promoter from plasmid pPUR (Clontech Laboratories, Inc., Palo Alto, CA; SEQ
ID No. 22) was subcloned into the EcoRI/CRI site of pNEB193 (a pUC19
derivative obtained from New England Biolabs, Beverly, MA; SEQ ID No. 23)
generating the plasmid pSV40193.

The attP site was PCR amplified from the lambda genome (GenBank Accession
# NC_001416) using the following primers:

attPUP: CCTTGCGCTAATGCTCTGTTACAGG (SEQ ID No. 24)
attPDWN: CAGAGGCAGGGAGTGGGACAAAATTG (SEQ ID No. 25)
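
As a quick sanity check on the primer pair listed above, the following Python
sketch reports the length, GC fraction and reverse complement of each primer.
It is an illustrative helper only. The function names are hypothetical and
are not taken from any library or from this disclosure.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as given in the text (SEQ ID Nos. 24 and 25).
PRIMERS = {
    "attPUP": "CCTTGCGCTAATGCTCTGTTACAGG",
    "attPDWN": "CAGAGGCAGGGAGTGGGACAAAATTG",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```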

After amplification and purification of the resulting fragment, the attP site
was cloned into the SmaI site of pSV40193 and the orientation of the attP
site was determined by DNA sequence analysis (plasmid pSV40193attP). The gene
encoding puromycin resistance (Puro) was isolated by digesting the plasmid
pPUR (Clontech Laboratories, Inc., Palo Alto, CA) with AgeI/BamHI followed by
filling in the overhangs with Klenow, and was subsequently cloned into the
AscI site downstream of the attP site of pSV40193attP, generating the plasmid
pSV40193attPsensePUR (Figure 7; SEQ ID NO: 26).

The plasmid pSV40193attPsensePUR was digested with ScaI and co-transfected
with the plasmid pFK161 into mouse LMtk- cells, and platform artificial
chromosomes were identified and isolated as described herein. Briefly,
puromycin-resistant colonies were isolated and subsequently tested for
artificial chromosome formation via fluorescent in situ hybridization (FISH)
(using mouse major and minor DNA repeat sequences, the puromycin gene and
telomere sequences as probes), and then sorted by fluorescence-activated cell
sorting (FACS). From this sort, a subclone was isolated containing an
artificial chromosome, designated B19-38. FISH analysis of the B19-38
subclone demonstrated the presence of telomeres and mouse minor repeat
sequences on the MAC. DOT PCR revealed the absence of uncharacterized
euchromatic regions on the MAC. The process for generating this exemplary MAC
platform containing multiple site-specific recombination sites is summarized
in Figure 5. This MAC may subsequently be engineered to contain target gene
expression nucleic acids using the lambda integrase mediated site-specific
recombination system as described below.

B. Construction of Targeting Vector.

The construction of the targeting vector pAg2 is set forth in Example 5
herein.

C. Transfection of Promoterless Marker and Selection With Drug (See
Figure 9).

The mouse LMtk- cell line containing the MAC B19-38 (constructed as set forth
above and also referred to as a 2nd generation platform ACE) is plated onto
four 10 cm dishes at approximately 5 million cells per dish. The cells are
incubated overnight in DMEM with 10% fetal calf serum at 37°C and 5% CO2.
The following day the cells are transfected with 5 µg of the vector pAg2
(prepared as described in Example 5 above) and 5 µg of pCXLamIntR (encoding
a lambda integrase having an E to R amino acid substitution at position 174),
for a total of 10 µg per 10 cm dish. Lipofectamine Plus reagent is used to
transfect the cells according to the manufacturer's protocol. Two days
post-transfection, zeocin is added to the medium at 500 µg/ml. The cells are
maintained in selective medium until colonies are formed. The colonies are
then ring-cloned and genomic DNA is analyzed.
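
The quantities in the transfection step above scale linearly with the number
of dishes. The following Python sketch simply tallies them for the four-dish
setup described in the text. It is a bookkeeping aid only. The variable names
are hypothetical, and the 10 ml medium volume in the usage line is an assumed
value for illustration, since the text does not state a volume.

```python
# Values stated in the text.
DISHES = 4
CELLS_PER_DISH = 5_000_000          # approximately 5 million cells per dish
PAG2_UG_PER_DISH = 5                # µg of pAg2 per dish
INTEGRASE_UG_PER_DISH = 5           # µg of pCXLamIntR per dish
ZEOCIN_UG_PER_ML = 500              # selection concentration

total_cells = DISHES * CELLS_PER_DISH
dna_per_dish_ug = PAG2_UG_PER_DISH + INTEGRASE_UG_PER_DISH
total_pag2_ug = DISHES * PAG2_UG_PER_DISH
total_integrase_ug = DISHES * INTEGRASE_UG_PER_DISH


def zeocin_needed_ug(medium_volume_ml: int) -> int:
    """Zeocin mass (µg) for a given medium volume at the stated concentration."""
    return ZEOCIN_UG_PER_ML * medium_volume_ml


print("cells plated:", total_cells)
print("DNA per dish (µg):", dna_per_dish_ug)
print("pAg2 total (µg):", total_pag2_ug)
print("pCXLamIntR total (µg):", total_integrase_ug)
# 10 ml per dish is an assumed volume, not a value from the text.
print("zeocin for 10 ml (µg):", zeocin_needed_ug(10))
```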

D. Analysis Of Clones (PCR, SEQUENCING). 

Genomic DNA (including MACs) is isolated from each of the candidate clones
with the Wizard kit (Promega), following the manufacturer's protocol. The
following primer set is used to analyze the genomic DNA isolated from the
zeocin-resistant clones:

5PacSV40: CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO: 28)
Antisense Zeo: TGAACAGGGTCACGTCGTCC (SEQ ID NO: 29)

PCR amplification using the above primers and genomic DNA, which included
MACs, from the candidate clones results in a PCR product indicating the
correct sequence for the desired site-specific integration event.
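
The forward primer above is split across two lines in the source text, so it
is worth confirming that the joined sequence is intact. The following Python
sketch joins the two fragments and locates the PacI recognition sequence
(TTAATTAA) that the primer name suggests. It is an illustrative check only,
and the helper name `find_site` is hypothetical.

```python
# PacI recognition sequence.
PAC_I_SITE = "TTAATTAA"

# Forward primer (SEQ ID NO: 28), joined from the two fragments in the text.
FORWARD_5PACSV40 = "CTGTTAATTAACTGTGGAATGTGTG" + "TCAGTTAGGGTG"
# Reverse primer (SEQ ID NO: 29).
ANTISENSE_ZEO = "TGAACAGGGTCACGTCGTCC"


def find_site(seq: str, site: str) -> int:
    """Return the 0-based index of the first occurrence of site, or -1."""
    return seq.find(site)


print("forward length:", len(FORWARD_5PACSV40))
print("PacI site index in forward primer:", find_site(FORWARD_5PACSV40, PAC_I_SITE))
print("reverse length:", len(ANTISENSE_ZEO))
```

Because one primer anneals in the SV40 promoter of the platform and the other
in the zeocin resistance sequence of the donor, a product is expected only
when the two elements have been joined by the integration event.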

The MACs containing the pAg2 vector are identified and used for transfer into
plant (such as described in Examples 16 and 17) or animal cells for the
expression of the desired coding sequences contained therein. The MACs
containing pAg2 carry two plant selectable markers (hygromycin resistance,
resistance to phosphinothricin) and a visual selectable marker (green
fluorescent protein).

Example 20 

Construction of Plant-derived Shuttle Artificial Chromosome.

In another embodiment, the plant artificial chromosomes provided herein are
useful as selectable shuttle vectors that are able to move one or more
desired genes back and forth between plant and mammalian cells. In this
particular embodiment, the plant artificial chromosome is bi-functional in
that proper integration of donor nucleic acid can be selected for in both
plant and mammalian cells.

For example, a plant artificial chromosome is prepared as described in
Examples 6-15 above using the plasmid pAg2 (Example 5; SEQ ID NO: 6) that has
been modified to include the SV40attPsensePur coding region from the plasmid
pSV40193attPsensePur (described above in Example 19.A.). Thus, the resulting
plant-derived shuttle artificial chromosome contains DNA from the bar gene
conferring resistance to phosphinothricin in plant cells, DNA from the
hygromycin resistance gene conferring resistance to hygromycin in plant
cells, both resistance-encoding DNAs under the control of a separate
cauliflower mosaic virus (CaMV) 35S promoter, the attB-promoterless zeomycin
resistance-encoding DNA, and DNA conferring resistance to puromycin under the
control of a mammalian SV40 promoter. Accordingly, the presence of the
shuttle PAC in either a plant or mammalian cell can be selected for by
treatment with, for example, either hygromycin (plant) or puromycin
(mammalian).
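
The marker complement of the shuttle artificial chromosome described above
can be summarized as a small lookup table. The following Python sketch is an
illustration only. The record layout and the function `selection_agents` are
hypothetical, and the entries merely restate the markers, promoters and host
cell types named in the preceding paragraph.

```python
from typing import List, NamedTuple, Optional


class Marker(NamedTuple):
    gene: str
    agent: str
    promoter: Optional[str]   # None means promoterless
    host: Optional[str]       # host in which the promoter is stated to act


SHUTTLE_MARKERS = [
    Marker("bar", "phosphinothricin", "CaMV 35S", "plant"),
    Marker("hygromycin resistance", "hygromycin", "CaMV 35S", "plant"),
    Marker("puromycin resistance", "puromycin", "SV40", "mammalian"),
    # Promoterless: not selectable until linked to a promoter.
    Marker("attB-zeomycin resistance", "zeomycin", None, None),
]


def selection_agents(host: str) -> List[str]:
    """Agents usable to select for the shuttle chromosome in the given host."""
    return [m.agent for m in SHUTTLE_MARKERS
            if m.promoter is not None and m.host == host]


print("plant:", selection_agents("plant"))
print("mammalian:", selection_agents("mammalian"))
```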

Because the resulting plant-derived shuttle artificial chromosome contains at
least one SV40attP site therein similar to the platform MAC prepared in
Example 19.A. above, a donor vector containing an attB-selectable marker
sequence, such as a plasmid comprising an attBzeo (e.g., pAg2), can be used
to selectively introduce desired heterologous nucleic acids from any species
(such as plants, animals, insects and the like) into the shuttle artificial
chromosome that is present in a mammalian cell.

Likewise, a plant promoter region, such as CaMV35S, can be used to replace
the SV40 promoter in the SV40attPPur region of the modified pAg2 plasmid
described above. In this embodiment, because the resulting plant-derived
shuttle artificial chromosome contains at least one CaMV35SattP site therein
analogous to the platform MAC prepared in Example 19.A. above, a donor vector
containing an attB-selectable marker sequence, such as a plasmid having
attBkanamycin, or other plant selectable or scorable marker, can be used to
selectively introduce desired heterologous nucleic acids from any species
(such as plants, animals, insects and the like) into the shuttle artificial
chromosome that is present in a plant cell.

Since modifications will be apparent to those of skill in this art, it is 
intended that this invention be limited by only the scope of the appended 
claims. 
What is Claimed:

1. A method for producing an artificial chromosome, comprising:
introducing nucleic acid into a cell comprising one or more plant
chromosomes; and
selecting a cell comprising an artificial chromosome that comprises one or
more repeat regions
wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

2. The method of claim 1, wherein the artificial chromosome is 
predominantly made up of one or more repeat regions. 

3. The method of claim 1, wherein the nucleic acid introduced into the cell
comprises a nucleic acid sequence that facilitates amplification of a region
of a plant chromosome or targets it to an amplifiable region of a plant
chromosome.

4. The method of claim 1, wherein the nucleic acid introduced into the cell
comprises one or more nucleic acids selected from the group consisting of
rDNA, lambda phage DNA and satellite DNA.

5. The method of claim 4, wherein the nucleic acid comprises plant rDNA.

6. The method of claim 5, wherein the rDNA is from a plant selected from the
group consisting of Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus,
Hordeum, Zea mays, Brassica, Triticum and Oryza.

7. The method of claim 4, wherein the nucleic acid comprises animal rDNA.

8. The method of claim 7, wherein the rDNA is mammalian rDNA.

9. The method of claim 4, wherein the nucleic acid comprises rDNA comprising
sequence of an intergenic spacer region.

10. The method of claim 9, wherein the intergenic spacer region is from DNA
from a plant selected from the group consisting of Arabidopsis, Solanum,
Lycopersicon, Hordeum, Zea, Oryza, rye, wheat, radish and mung bean.

11. The method of claim 1, wherein the nucleic acid introduced into the cell
comprises a nucleic acid sequence that facilitates identification of cells
containing the nucleic acid.

12. The method of claim 11, wherein the nucleic acid sequence encodes a
fluorescent protein.

13. The method of claim 12, wherein the protein is a green fluorescent
protein.

14. The method of claim 1, wherein the step of selecting a cell comprising
an artificial chromosome comprises sorting of cells into which nucleic acid
was introduced.

15. The method of claim 1, wherein the step of selecting a cell comprising
an artificial chromosome comprises fluorescent in situ hybridization (FISH)
analysis of cells into which nucleic acid was introduced.

16. The method of claim 1, wherein the one or more plant chromosomes
contained in the cell is (are) selected from the group consisting of
Arabidopsis, tobacco and Helianthus cells.

17. The method of claim 16, wherein the cell is a plant protoplast. 

18. The method of claim 1, wherein the nucleic acid introduced into the cell
comprises nucleic acid encoding a selectable marker.

19. The method of claim 18, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, kanamycin, 
hygromycin, dihydrofolate or sulfonylurea. 

20. An isolated plant artificial chromosome comprising one or more repeat
regions, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

21. The plant artificial chromosome of claim 20, wherein the artificial
chromosome is predominantly made up of one or more repeat regions.

22. A plant cell comprising an artificial chromosome, wherein the 
artificial chromosome is produced by the method of claim 1 or claim 2. 

23. A method of producing a transgenic plant, comprising introducing the
artificial chromosome of claim 20 or claim 21 into a plant cell.

24. The method of claim 23, wherein the artificial chromosome 
comprises heterologous nucleic acid encoding a gene product. 

25. The method of claim 24, wherein the heterologous nucleic acid encodes a
product selected from the group consisting of enzymes, antisense RNA, tRNA,
rDNA, structural proteins, marker proteins, ligands, receptors, ribozymes,
therapeutic proteins and biopharmaceutical proteins.

26. The method of claim 24, wherein the heterologous nucleic acid encodes a
product selected from the group consisting of vaccines, blood factors,
antigens, hormones, cytokines, growth factors and antibodies.

27. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product that provides for resistance to diseases, insects, herbicides 
or stress in the plant. 

28. The method of claim 24, wherein the heterologous nucleic acid encodes a
product that provides for an agronomically important trait in the plant.

29. The method of claim 24, wherein the heterologous nucleic acid encodes a
product that alters the nutrient utilization and/or improves the nutrient
quality of the plant.

30. The method of claim 24, wherein the heterologous nucleic acid is
contained within a bacterial artificial chromosome (BAC) or a yeast
artificial chromosome (YAC).

31. A method of identifying plant genes encoding particular traits,
comprising:
generating an artificial chromosome comprising euchromatic DNA from a first
species of plant;
introducing the artificial chromosome into a plant cell of a second species
of plant; and
detecting phenotypic changes in the plant cell comprising the artificial
chromosome and/or a plant generated from the plant cell comprising the
artificial chromosome.

32. The method of claim 31, wherein the artificial chromosome is a 
plant artificial chromosome or a mammalian artificial chromosome. 

33. The method of claim 31, wherein the artificial chromosome is produced by
a method comprising:
introducing nucleic acid into a cell comprising one or more plant
chromosomes; and
selecting a plant cell comprising an artificial chromosome that comprises
one or more repeat regions, wherein:
repeats of a nucleic acid unit have common nucleic acid sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

34. The method of claim 31, wherein the artificial chromosome is produced by
a method comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising a SATAC.

35. The method of claim 31, wherein the artificial chromosome is a
minichromosome produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a cell comprising a minichromosome comprising a neocentromere and
euchromatin.

36. The method of any of claims 33-35, wherein the nucleic acid introduced
into the plant cell comprises DNA encoding a selectable marker.

37. The method of claim 36, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, kanamycin, 
hygromycin, dihydrofolate or sulfonylurea. 

38. The method of claim 31, wherein the artificial chromosome comprising
euchromatic DNA from a first plant species is produced by a method
comprising:
introducing into a plant cell of a first plant species an artificial
chromosome capable of undergoing homologous recombination with the DNA of
the first plant species;
selecting for a recombination event between the artificial chromosome and
the DNA of the first plant species; and
selecting an artificial chromosome comprising euchromatic DNA from the
first plant species.

39. The method of claim 31, wherein the artificial chromosome comprising
euchromatic DNA from a first plant species is produced by a method
comprising:
introducing into a plant cell of a first species an artificial chromosome
capable of undergoing site-specific recombination with the DNA of the first
plant species;
selecting for a site-specific recombination event between the artificial
chromosome and the DNA of the first plant species; and
selecting an artificial chromosome comprising euchromatic DNA from the
first plant species.

40. The method of claim 39, wherein the DNA of the plant cell of a first
species is modified to comprise a site-specific recombination sequence.

41. The method of claim 39, wherein the artificial chromosome comprises a
site-specific recombination sequence.

42. The method of claim 39, wherein the DNA of the plant cell of a first
species is modified to comprise a site-specific recombination sequence and
the artificial chromosome comprises a site-specific recombination sequence.

43. The method of claim 39, wherein the DNA of the plant cell of a first
species is modified to comprise a site-specific recombination sequence and
the artificial chromosome comprises a site-specific recombination sequence
that is complementary to the site-specific recombination sequence of the
plant cell of a first plant species.

44. The method of claim 39, wherein the site-specific recombination 
is catalyzed by a recombinase enzyme. 

45. A method for producing an acrocentric plant chromosome, comprising:
introducing a first nucleic acid comprising a site-specific recombination
site into a first chromosome of a plant cell;
introducing a second nucleic acid comprising a site-specific recombination
site into a second chromosome of the plant cell;
introducing a recombinase activity into the plant cell, wherein the
activity catalyzes recombination between the first and second chromosomes
and whereby an acrocentric plant chromosome is produced.

46. The method of claim 45, wherein the first nucleic acid is 
introduced into the pericentric heterochromatin of the first chromosome. 

47. The method of claim 45, wherein the second nucleic acid is introduced
into the distal end of the arm of the second chromosome.

48. The method of claim 45, wherein the first nucleic acid is introduced
into the pericentric heterochromatin of the first chromosome and the second
nucleic acid is introduced into the distal end of the arm of the second
chromosome.

49. A method for producing an acrocentric plant chromosome, comprising:
introducing a first nucleic acid comprising a site-specific recombination
site into the pericentric heterochromatin of a chromosome in a plant cell;
introducing a second nucleic acid comprising a site-specific recombination
site into the distal end of the chromosome, wherein the first and second
recombination sites are located on the same arm of the chromosome;
introducing a recombinase activity into the cell, wherein the activity
catalyzes recombination between the first and second recombination sites in
the chromosome and whereby an acrocentric plant chromosome is produced.

50. A method for producing an acrocentric plant chromosome, comprising:
introducing nucleic acid comprising a recombination site adjacent to
nucleic acid encoding a selectable marker into a first plant cell;
generating a first transgenic plant from the first plant cell;
introducing nucleic acid comprising a promoter functional in a plant cell,
a recombination site and a recombinase coding region in operative linkage
into a second plant cell;
generating a second transgenic plant from the second plant cell;
crossing the first and second plants;
obtaining plants resistant to an agent that selects for cells containing
the nucleic acid encoding the selectable marker; and
selecting a resistant plant that contains cells comprising an acrocentric
plant chromosome.

51. The method of any of claims 45-50, wherein the DNA of the short arm of
the acrocentric chromosome contains less than 5% euchromatic DNA.

52. The method of any of claims 45-50, wherein the DNA of the short arm of
the acrocentric chromosome contains less than 1% euchromatic DNA.

53. The method of any of claims 45-50, wherein the short arm of the
acrocentric chromosome does not contain euchromatic DNA.

54. The method of any of claims 45-49, wherein the nucleic acid introduced
into a chromosome comprises nucleic acid encoding a selectable marker.

55. An acrocentric plant artificial chromosome, wherein the short arm 
of the acrocentric chromosome does not contain euchromatic DNA. 

56. A method of producing a plant artificial chromosome, comprising:
introducing nucleic acid into a plant acrocentric chromosome in a cell,
wherein the short arm of the acrocentric chromosome does not contain
euchromatic DNA;
culturing the cell through at least one cell division; and
selecting a cell comprising an artificial chromosome, is predominantly
heterochromatic.

57. The method of claim 56, wherein the acrocentric chromosome is produced
by the method of any of claims 45-49.

58. A method for producing an artificial chromosome, comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising an artificial chromosome that comprises
one or more repeat regions
wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid sequences; and
the common nucleic acid sequences comprise sequences that represent
euchromatic and heterochromatic nucleic acid.

59. The method of claim 4, wherein the nucleic acid comprises plant 
rDNA from a dicot plant species. 

60. The method of claim 4, wherein the nucleic acid comprises plant rDNA
from a monocot plant species.

61. The method of claim 9, wherein the intergenic spacer region is from DNA
from a Nicotiana plant.

62. The method of claim 9, wherein the rDNA is plant rDNA.

63. The method of claim 62, wherein the plant is a dicot plant species.

64. The method of claim 62, wherein the plant is a monocot plant species.

65. The method of claim 1, wherein the cell is a dicot plant cell.

66. The method of claim 1, wherein the cell is a monocot plant cell.

67. An isolated plant artificial chromosome comprising one or more repeat
regions, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid sequences; and
the common nucleic acid sequences comprise sequences that represent
euchromatic and heterochromatic nucleic acid.

68. The method of claim 31, wherein the artificial chromosome is produced by
a method comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising an artificial chromosome that comprises
one or more repeat regions, wherein:
repeats of a nucleic acid unit have common nucleic acid sequences; and
the common nucleic acid sequences comprise sequences that represent
euchromatic and heterochromatic nucleic acid.

69. The method of claim 44, wherein the recombinase is selected from the
group consisting of a bacteriophage P1 Cre recombinase, a yeast R
recombinase and a yeast FLP recombinase.

70. The method of claim 50, further comprising selecting first and second
transgenic plants wherein:
one of the plants comprises a chromosome comprising a recombination site
located on a short arm of the chromosome in a region adjacent to the
pericentric heterochromatin; and
the other plant comprises a chromosome comprising a recombination site
located in rDNA of the chromosome.

71. The method of claim 70, wherein the recombination sites on the two
chromosomes are in the same orientation.

72. A method for producing an acrocentric plant chromosome, comprising:
introducing nucleic acid comprising two site-specific recombination sites
into a cell comprising one or more plant chromosomes;
introducing a recombinase activity into the cell, wherein the activity
catalyzes recombination between the two recombination sites, whereby a
plant acrocentric chromosome is produced.

73. The method of claim 72, wherein the two site-specific recombination
sites are contained on separate nucleic acid fragments.

74. The method of claim 73, wherein the separate nucleic acid 
fragments are introduced into the cell simultaneously or sequentially. 

75. The method of claim 56, wherein the artificial chromosome is
predominantly heterochromatic.

76. A method of producing a plant artificial chromosome, comprising:
introducing nucleic acid into a plant chromosome in a cell, wherein the
chromosome contains adjacent regions of rDNA and heterochromatic DNA;
culturing the cell through at least one cell division; and
selecting a cell comprising an artificial chromosome.

77. The method of claim 76, wherein the artificial chromosome is 
predominantly heterochromatic. 

78. The method of claim 76 or claim 77, wherein the plant chromosome into
which the nucleic acid is introduced is an acrocentric chromosome.

79. The method of claim 78, wherein the short arm of the chromosome contains
adjacent regions of rDNA and heterochromatic DNA.

80. The method of any of claims 76-79, wherein the heterochromatic 
DNA is pericentric heterochromatin. 

81. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably associated
with any promoter, wherein the selectable marker permits growth of animal
cells in the presence of an agent normally toxic to the animal cells; and
wherein the agent is not toxic to plant cells;
a recognition site for recombination; and
a sequence of nucleotides that facilitates amplification of a region of a
plant chromosome or targets the vector to an amplifiable region of a plant
chromosome.

82. The vector of claim 81, wherein the amplifiable region comprises
heterochromatic nucleic acid.

83. The vector of claim 81, wherein the amplifiable region comprises rDNA.

84. The vector of claim 81, wherein the sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome comprises a sufficient
portion of an intergenic spacer region of rDNA to facilitate amplification
or effect the targeting.

85. The vector of claim 84, wherein the sufficient portion contains at least
14, 20, 30, 50, 100, 150, 300 or 500 contiguous nucleotides from an
intergenic spacer region.

86. The vector of claim 81, wherein the selectable marker encodes a product
that confers resistance to zeomycin.

88. The vector of claim 81, wherein the recognition site comprises an att
site.

89. The vector of claim 81, that is pAgIIa or pAgIIb.
90. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably associated
with any promoter, wherein the selectable marker permits growth of animal
cells in the presence of an agent normally toxic to the animal cells; and
wherein the agent is not toxic to plant cells;
a recognition site for recombination; and
nucleic acid encoding a protein operably linked to a plant promoter.

91. The vector of claim 90, wherein the recognition site comprises an att
site.

92. The vector of claim 90, further comprising a sequence of nucleotides
that facilitates amplification of a region of a plant chromosome or targets
the vector to an amplifiable region of a plant chromosome.

93. The vector of claim 90, wherein the promoter is nopaline synthase (NOS)
or CaMV35S.

94. The vector of claim 93 that is pAg1 or pAg2.

95. The vector of claim 92, wherein the amplifiable region comprises 
heterochromatic nucleic acid. 

96. The vector of claim 92, wherein the amplifiable region comprises 

rDNA. 

97. The vector of claim 96, wherein the sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome comprises a sufficient
portion of an intergenic spacer region of rDNA to effect the amplification
or the targeting.

98. The vector of claim 90, wherein the protein is a selectable marker that
permits growth of plant cells in the presence of an agent normally toxic to
the plant cells.

99. The vector of claim 98, wherein the selectable marker confers resistance
to hygromycin or to phosphinothricin.

100. The vector of claim 90, wherein the protein is a fluorescent protein.

101. The vector of claim 90, wherein the fluorescent protein is selected 
from the group consisting of green, blue and red fluorescent proteins. 

102. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably associated
with any promoter, wherein the selectable marker permits growth of plant
cells in the presence of an agent normally toxic to the plant cells; and
wherein the agent is not toxic to animal cells;
a recognition site for recombination; and
nucleic acid encoding a protein operably linked to a plant promoter.

103. A vector, comprising:
a recognition site for recombination; and
a sequence of nucleotides that facilitates amplification of a region of a
plant chromosome or targets the vector to an amplifiable region of a plant
chromosome, wherein the plant is selected from the group consisting of
Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus, Hordeum, Zea mays,
Brassica, Triticum, Helianthus, Glycine, soybean, Gossypium, cotton,
Helianthus, sunflower and Oryza.

104. The vector of claim 103, wherein the recognition site comprises an att
site.

105. A cell, comprising a vector of any of claims 81-104.

106. The cell of claim 105 that is a plant cell.

107. A method, comprising:
introducing a vector of claim 90 into a cell, wherein:
the cell comprises an animal platform ACes that contains a recognition site
that recombines with the recognition site in the vector in the presence of
the recombinase therefor, thereby incorporating the selectable marker that
is not operably associated with any promoter and the nucleic acid encoding a
protein operably linked to a plant promoter into the platform ACes to
produce a resulting platform ACes.

1 08. The method of claim 1 07, wherein the recombination sites are att 

sites. 

5 109. The method of claim 107, wherein the animal is a mammal. 

110. The method of claim 107, wherein the platform ACes comprises 
a promoter that upon recombination is operably linked to the selectable marker 
that in the vector is not operably associated with a promoter.

111. The method of any of claims 107-110, further comprising
transferring the resulting platform ACes into a plant cell to produce a plant cell
that comprises the platform ACes.

112. The method of claim 111, wherein the resulting platform ACes is
isolated prior to transfer.

113. The method of claim 111, wherein the isolated ACes is introduced
into a plant cell by a method selected from the group consisting of protoplast
transfection, lipid-mediated delivery, liposomes, electroporation, sonoporation,
microinjection, particle bombardment, silicon carbide whisker-mediated
transformation, polyethylene glycol (PEG)-mediated DNA uptake, lipofection and
lipid-mediated carrier systems.

114. The method of claim 111, wherein the resulting platform ACes is
transferred by fusion of the cells.

115. The method of claim 111, wherein the cells are plant protoplasts. 

116. The method of claim 107, wherein the cell is an animal
cell.

117. The method of claim 116, wherein the animal cell is a mammalian
cell.

118. The method of claim 111, further comprising culturing the plant
cell that comprises the platform ACes under conditions whereby the protein
encoded by the nucleic acid that is operably linked to a plant promoter is
expressed.

119. A method, comprising:

introducing a vector of claim 81 into a plant cell; 
culturing the plant cells; and 

selecting a plant cell comprising an artificial chromosome that comprises
one or more repeat regions.

120. The method of claim 119, wherein a sufficient portion of the vector
integrates into a chromosome in the plant cell to result in amplification of
chromosomal DNA.

121. The method of claim 119 or claim 120, wherein:

one or more nucleic acid units is (are) repeated in a repeat region;

repeats of a nucleic acid unit have common nucleic acid
sequences; and

the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

122. The method of claim 119, further comprising isolating the artificial
chromosome.

123. A method, comprising: 

introducing a vector into a cell, wherein: 
i) the vector comprises:
a) nucleic acid encoding a selectable marker that is
not operably associated with any promoter, wherein the selectable
marker permits growth of animal cells in the presence of an agent
normally toxic to the animal cells; and wherein the agent is not
toxic to plant cells;
b) a recognition site for recombination; and

c) nucleic acid encoding a protein operably linked to
an animal promoter;

ii) the cell comprises:

a platform plant artificial chromosome (PAC) that comprises
a recombination site and an animal promoter that upon
recombination is operably linked to the selectable marker that, in
the vector, is not operably associated with a promoter;

iii) introduction is effected under conditions whereby the
vector recombines with the PAC to produce a plant platform PAC that contains
the selectable marker operably linked to the promoter; and

culturing the resulting cell under conditions, whereby the protein encoded 
by nucleic acid operably linked to an animal promoter is expressed. 

124. The method of claim 119, wherein the artificial chromosome is an
ACes.

125. The method of claim 123, wherein the plant platform PAC is an
ACes.

126. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises nucleic acid encoding a selectable marker. 

127. The vector of claim 81, further comprising one or more selectable
markers that when expressed in the plant cell permit the selection of the cell.

128. A plant transformation vector, comprising: 
a recognition site for recombination; 

a sequence of nucleotides that facilitates amplification of a region
of a plant chromosome or targets the vector to an amplifiable region of a plant
chromosome; and

one or more selectable markers that when expressed in a plant cell
permit the selection of the cell; wherein

the plant transformation vector is for Agrobacterium-mediated
transformation of plants.

129. A method of producing a plant artificial chromosome, comprising:

introducing the vector of any of claims 81, 127 and 128 into a cell
comprising one or more plant chromosomes; and

selecting a cell comprising an artificial chromosome that comprises 
one or more repeat regions; wherein 
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and

the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

130. A method of producing a plant artificial chromosome, comprising:

introducing the vector of any of claims 81, 127 and 128 into a cell
comprising one or more plant chromosomes; and 

selecting a cell comprising an artificial chromosome that comprises 
one or more repeat regions; wherein 
one or more nucleic acid units is (are) repeated in a repeat region;

repeats of a nucleic acid unit have common nucleic acid
sequences; and

the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

131. The method of claim 123, wherein the cell into which the vector
is introduced is an animal cell.

132. The method of claim 131, wherein the cell is a mammalian cell. 
Fig. 5 Construction of pAglla and pAgllb

[Figure: plasmid construction scheme. Legible labels include pGEM-TEasy, NosR, GUS, pNGN-1, pNGN-2, pNGN-3, Sac II and Not I; the remaining restriction-site labels are not recoverable.]
WO 02/096923 PCT/US02/17451

[Drawing sheets 6/9 to 9/9: figure content not recoverable.]

SEQUENCE LISTING 

<110> CHROMOS MOLECULAR SYSTEMS, INC.
Perez, Carl
Fabijanski, Steven
Perkins, Edward

<120> Plant Artificial Chromosomes, Uses thereof, and Methods of Preparing
Plant Artificial Chromosomes

<130> 24601-419PC 

<140> Not Yet Assigned 
<141> Herewith 

<150> US 60/294,687 
<151> 2001-05-30 

<150> US 60/296,329 
<151> 2001-06-04 

<160> 51 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 11182 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAg1 plasmid
<400> 1 

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60 
atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120 
agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180 
gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240 
agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300 
ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360 
ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420 
acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480 
ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540 
acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600 
agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660 
tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720
tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780 
ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840 
gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900 
gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960 
cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020
ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080 
gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140 
tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200 
aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260 
aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320
ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380 
ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440 
cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500 
atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560 
accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620
gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680 
gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740 
ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800 
cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860 
aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920
gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980
agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040
ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100 
atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160 
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220 
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280 
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340 
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460 
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520 
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580 
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640 
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700 
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760 
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880 
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940 
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120 
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180 
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240 
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300 
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420 
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480 
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540 
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720 
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780 
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020 
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140 
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200 
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260 
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320 
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380 
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440 
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500 
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560 
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620 
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680 
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740 
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860 
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040 
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100 
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160 
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280 
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340 
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400 
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460 
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520 
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580 
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640 
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700 
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760 
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040
tacgaattcg agccttgact agagggtcga cggtatacag acatgataag atacattgat 8100
gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 8160
gatgctattg ctttatttgt aaccattata agctgcaata aacaagttgg ggtgggcgaa 8220
gaactccagc atgagatccc cgcgctggag gatcatccag ccggcgtccc ggaaaacgat 8280
tccgaagccc aacctttcat agaaggcggc ggtggaatcg aaatctcgta gcacgtgtca 8340
gtcctgctcc tcggccacga agtgcacgca gttgccggcc gggtcgcgca gggcgaactc 8400
ccgcccccac ggctgctcgc cgatctcggt catggccggc ccggaggcgt cccggaagtt 8460
cgtggacacg acctccgacc actcggcgta cagctcgtcc aggccgcgca cccacaccca 8520
ggccagggtg ttgtccggca ccacctggtc ctggaccgcg ctgatgaaca gggtcacgtc 8580
gtcccggacc acaccggcga agtcgtcctc cacgaagtcc cgggagaacc cgagccggtc 8640
ggtccagaac tcgaccgctc cggcgacgtc gcgcgcggtg agcaccggaa cggcactggt 8700
caacttggcc atggatccag atttcgctca agttagtata aaaaagcagg cttcaatcct 8760
gcaggaattc gatcgacact ctcgtctact ccaagaatat caaagataca gtctcagaag 8820
accaaagggc tattgagact tttcaacaaa gggtaatatc gggaaacctc ctcggattcc 8880
attgcccagc tatctgtcac ttcatcaaaa ggacagtaga aaaggaaggt ggcacctaca 8940
aatgccatca ttgcgataaa ggaaaggcta tcgttcaaga tgcctctgcc gacagtggtc 9000
ccaaagatgg acccccaccc acgaggagca tcgtggaaaa agaagacgtt ccaaccacgt 9060
cttcaaagca agtggattga tgtgataaca tggtggagca cgacactctc gtctactcca 9120
agaatatcaa agatacagtc tcagaagacc aaagggctat tgagactttt caacaaaggg 9180
taatatcggg aaacctcctc ggattccatt gcccagctat ctgtcacttc atcaaaagga 9240
cagtagaaaa ggaaggtggc acctacaaat gccatcattg cgataaagga aaggctatcg 9300
ttcaagatgc ctctgccgac agtggtccca aagatggacc cccacccacg aggagcatcg 9360
tggaaaaaga agacgttcca accacgtctt caaagcaagt ggattgatgt gatatctcca 9420
ctgacgtaag ggatgacgca caatcccact atccttcgca agaccttcct ctatataagg 9480
aagttcattt catttggaga ggacacgctg aaatcaccag tctctctcta caaatctatc 9540
tctctcgagc tttcgcagat ccgggggggc aatgagatat gaaaaagcct gaactcaccg 9600
cgacgtctgt cgagaagttt ctgatcgaaa agttcgacag cgtctccgac ctgatgcagc 9660
tctcggaggg cgaagaatct cgtgctttca gcttcgatgt aggagggcgt ggatatgtcc 9720
tgcgggtaaa tagctgcgcc gatggtttct acaaagatcg ttatgtttat cggcactttg 9780
catcggccgc gctcccgatt ccggaagtgc ttgacattgg ggagtttagc gagagcctga 9840
cctattgcat ctcccgccgt gcacagggtg tcacgttgca agacctgcct gaaaccgaac 9900
tgcccgctgt tctacaaccg gtcgcggagg ctatggatgc gatcgctgcg gccgatctta 9960
gccagacgag cgggttcggc ccattcggac cgcaaggaat cggtcaatac actacatggc 10020
gtgatttcat atgcgcgatt gctgatcccc atgtgtatca ctggcaaact gtgatggacg 10080
acaccgtcag tgcgtccgtc gcgcaggctc tcgatgagct gatgctttgg gccgaggact 10140
gccccgaagt ccggcacctc gtgcacgcgg atttcggctc caacaatgtc ctgacggaca 10200
atggccgcat aacagcggtc attgactgga gcgaggcgat gttcggggat tcccaatacg 10260
aggtcgccaa catcttcttc tggaggccgt ggttggcttg tatggagcag cagacgcgct 10320
acttcgagcg gaggcatccg gagcttgcag gatcgccacg actccgggcg tatatgctcc 10380
gcattggtct tgaccaactc tatcagagct tggttgacgg caatttcgat gatgcagctt 10440
gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc gggcgtacac 10500
aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg tgtagaagta ctcgccgata 10560
gtggaaaccg acgccccagc actcgtccga gggcaaagaa atagagtaga tgccgaccgg 10620
atctgtcgat cgacaagctc gagtttctcc ataataatgt gtgagtagtt cccagataag 10680
ggaattaggg ttcctatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 10740
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa accaaaatcc 10800
agtactaaaa tccagatccc ccgaattaat tcggcgttaa ttcagatcaa gcttggcact 10860
ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc gttacccaac ttaatcgcct 10920
tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca ccgatcgccc 10980
ttcccaacag ttgcgcagcc tgaatggcga atgctagagc agcttgagct tggatcagat 11040
tgtcgtttcc cgccttcagt ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 11100
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 11160
tccgttcgtc catttgtatg tg 11182

<210> 2
<211> 8428
<212> DNA
<213> Artificial Sequence

<220>

<223> pCambia3300 plasmid
<400> 2

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60
atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120
agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180
gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240
agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300
ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360
ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420
acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480
ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540
acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600
agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660
tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720
tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780
ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840
gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900
gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960
cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020
ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080
gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140
tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200
aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260
aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320
ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380
ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440
cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500
atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560
accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620
gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680
gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740
ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800
cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860
aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920
gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980
agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040
ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100
atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040
tacgaattcg agctcggtac ccggggatcc tctagagtcg acctgcaggc atgcaagctt 8100
ggcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa 8160
tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg cccgcaccga 8220
tcgcccttcc caacagttgc gcagcctgaa tggcgaatgc tagagcagct tgagcttgga 8280
tcagattgtc gtttcccgcc ttcagtttaa actatcagtg tttgacagga tatattggcg 8340
ggtaaaccta agagaaaaga gcgtttatta gaataacgga tatttaaaag ggcgtgaaaa 8400
ggtttatccg ttcgtccatt tgtatgtg 8428



<210> 3 
<211> 10549 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCambia1302 plasmid 
<300> 

<308> Genbank #AF234298 
<309> 2000-04-24 



<400> 3 

catggtagat ctgactagta aaggagaaga acttttcact ggagttgtcc caattcttgt 60 
tgaattagat ggtgatgtta atgggcacaa attttctgtc agtggagagg gtgaaggtga 120 
tgcaacatac ggaaaactta cccttaaatt tatttgcact actggaaaac tacctgttcc 180 
gtggccaaca cttgtcacta ctttctctta tggtgttcaa tgcttttcaa gatacccaga 240 
tcatatgaag cggcacgact tcttcaagag cgccatgcct gagggatacg tgcaggagag 300 
gaccatcttc ttcaaggacg acgggaacta caagacacgt gctgaagtca agtttgaggg 360 
agacaccctc gtcaacagga tcgagcttaa gggaatcgat ttcaaggagg acggaaacat 420 
cctcggccac aagttggaat acaactacaa ctcccacaac gtatacatca tggccgacaa 480 
gcaaaagaac ggcatcaaag ccaacttcaa gacccgccac aacatcgaag acggcggcgt 540 
gcaactcgct gatcattatc aacaaaatac tccaattggc gatggccctg tccttttacc 600 
agacaaccat tacctgtcca cacaatctgc cctttcgaaa gatcccaacg aaaagagaga 660 
ccacatggtc cttcttgagt ttgtaacagc tgctgggatt acacatggca tggatgaact 720 
atacaaagct agccaccacc accaccacca cgtgtgaatt ggtgaccagc tcgaatttcc 780 
ccgatcgttc aaacatttgg caataaagtt tcttaagatt gaatcctgtt gccggtcttg 840 
cgatgattat catataattt ctgttgaatt acgttaagca tgtaataatt aacatgtaat 900 
gcatgacgtt atttatgaga tgggttttta tgattagagt cccgcaatta tacatttaat 960 
acgcgataga aaacaaaata tagcgcgcaa actaggataa attatcgcgc gcggtgtcat 1020 
ctatgttact agatcgggaa ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 1080 
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 1140 
tccgttcgtc catttgtatg tgcatgccaa ccacagggtt cccctcggga tcaaagtact 1200 
ttgatccaac ccctccgctg ctatagtgca gtcggcttct gacgttcagt gcagccgtct 1260 
tctgaaaacg acatgtcgca caagtcctaa gttacgcgac aggctgccgc cctgcccttt 1320 
tcctggcgtt ttcttgtcgc gtgttttagt cgcataaagt agaatacttg cgactagaac 1380 
cggagacatt acgccatgaa caagagcgcc gccgctggcc tgctgggcta tgcccgcgtc 1440 
agcaccgacg accaggactt gaccaaccaa cgggccgaac tgcacgcggc cggctgcacc 1500 
aagctgtttt ccgagaagat caccggcacc aggcgcgacc gcccggagct ggccaggatg 1560 
cttgaccacc tacgccctgg cgacgttgtg acagtgacca ggctagaccg cctggcccgc 1620 
agcacccgcg acctactgga cattgccgag cgcatccagg aggccggcgc gggcctgcgt 1680 
agcctggcag agccgtgggc cgacaccacc acgccggccg gccgcatggt gttgaccgtg 1740 
ttcgccggca ttgccgagtt cgagcgttcc ctaatcatcg accgcacccg gagcgggcgc 1800 
gaggccgcca aggcccgagg cgtgaagttt ggcccccgcc ctaccctcac cccggcacag 1860 
atcgcgcacg cccgcgagct gatcgaccag gaaggccgca ccgtgaaaga ggcggctgca 1920 
ctgcttggcg tgcatcgctc gaccctgtac cgcgcacttg agcgcagcga ggaagtgacg 1980 
cccaccgagg ccaggcggcg cggtgccttc cgtgaggacg cattgaccga ggccgacgcc 2040 
ctggcggccg ccgagaatga acgccaagag gaacaagcat gaaaccgcac caggacggcc 2100 
aggacgaacc gtttttcatt accgaagaga tcgaggcgga gatgatcgcg gccgggtacg 2160 
tgttcgagcc gcccgcgcac gtctcaaccg tgcggctgca tgaaatcctg gccggtttgt 2220 
ctgatgccaa gctggcggcc tggccggcca gcttggccgc tgaagaaacc gagcgccgcc 2280 
gtctaaaaag gtgatgtgta tttgagtaaa acagcttgcg tcatgcggtc gctgcgtata 2340 
tgatgcgatg agtaaataaa caaatacgca aggggaacgc atgaaggtta tcgctgtact 2400 
taaccagaaa ggcgggtcag gcaagacgac catcgcaacc catctagccc gcgccctgca 2460 
actcgccggg gccgatgttc tgttagtcga ttccgatccc cagggcagtg cccgcgattg 2520 
ggcggccgtg cgggaagatc aaccgctaac cgttgtcggc atcgaccgcc cgacgattga 2580 
ccgcgacgtg aaggccatcg gccggcgcga cttcgtagtg atcgacggag cgccccaggc 2640 
ggcggacttg gctgtgtccg cgatcaaggc agccgacttc gtgctgattc cggtgcagcc 2700 
aagcccttac gacatatggg ccaccgccga cctggtggag ctggttaagc agcgcattga 2760 
ggtcacggat ggaaggctac aagcggcctt tgtcgtgtcg cgggcgatca aaggcacgcg 2820 
catcggcggt gaggttgccg aggcgctggc cgggtacgag ctgcccattc ttgagtcccg 2880 
tatcacgcag cgcgtgagct acccaggcac tgccgccgcc ggcacaaccg ttcttgaatc 2940 
agaacccgag ggcgacgctg cccgcgaggt ccaggcgctg gccgctgaaa ttaaatcaaa 3000 
actcatttga gttaatgagg taaagagaaa atgagcaaaa gcacaaacac gctaagtgcc 3060 
ggccgtccga gcgcacgcag cagcaaggct gcaacgttgg ccagcctggc agacacgcca 3120 
gccatgaagc gggtcaactt tcagttgccg gcggaggatc acaccaagct gaagatgtac 3180 
gcggtacgcc aaggcaagac cattaccgag ctgctatctg aatacatcgc gcagctacca 3240 
gagtaaatga gcaaatgaat aaatgagtag atgaatttta gcggctaaag gaggcggcat 3300 
ggaaaatcaa gaacaaccag gcaccgacgc cgtggaatgc cccatgtgtg gaggaacggg 3360 
cggttggcca ggcgtaagcg gctgggttgt ctgccggccc tgcaatggca ctggaacccc 3420 
caagcccgag gaatcggcgt gacggtcgca aaccatccgg cccggtacaa atcggcgcgg 3480 
cgctgggtga tgacctggtg gagaagttga aggccgcgca ggccgcccag cggcaacgca 3540 
tcgaggcaga agcacgcccc ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag 3600 
aatcccggca accgccggca gccggtgcgc cgtcgattag gaagccgccc aagggcgacg 3660 
agcaaccaga ttttttcgtt ccgatgctct atgacgtggg cacccgcgat agtcgcagca 3720 
tcatggacgt ggccgttttc cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc 3780 
gctacgagct tccagacggg cacgtagagg tttccgcagg gccggccggc atggccagtg 3840 
tgtgggatta cgacctggta ctgatggcgg tttcccatct aaccgaatcc atgaaccgat 3900 
accgggaagg gaagggagac aagcccggcc gcgtgttccg tccacacgtt gcggacgtac 3960 
tcaagttctg ccggcgagcc gatggcggaa agcagaaaga cgacctggta gaaacctgca 4020 
ttcggttaaa caccacgcac gttgccatgc agcgtacgaa gaaggccaag aacggccgcc 4080 
tggtgacggt atccgagggt gaagccttga ttagccgcta caagatcgta aagagcgaaa 4140 
ccgggcggcc ggagtacatc gagatcgagc tagctgattg gatgtaccgc gagatcacag 4200 
aaggcaagaa cccggacgtg ctgacggttc accccgatta ctttttgatc gatcccggca 4260 
tcggccgttt tctctaccgc ctggcacgcc gcgccgcagg caaggcagaa gccagatggt 4320 
tgttcaagac gatctacgaa cgcagtggca gcgccggaga gttcaagaag ttctgtttca 4380 
ccgtgcgcaa gctgatcggg tcaaatgacc tgccggagta cgatttgaag gaggaggcgg 4440 
ggcaggctgg cccgatccta gtcatgcgct accgcaacct gatcgagggc gaagcatccg 4500 
ccggttccta atgtacggag cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc 4560 
gaaaaggtct ctttcctgtg gatagcacgt acattgggaa cccaaagccg tacattggga 4620 
accggaaccc gtacattggg aacccaaagc cgtacattgg gaaccggtca cacatgtaag 4680 
tgactgatat aaaagagaaa aaaggcgatt tttccgccta aaactcttta aaacttatta 4740 
aaactcttaa aacccgcctg gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc 4800 
tgcaaaaagc gcctaccctt cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc 4860 
ctatcgcggc cgctggccgc tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc 4920 
gcggacaagc cgcgccgtcg ccactcgacc gccggcgccc acatcaaggc accctgcctc 4980 
gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca 5040 
gcttgtctgt aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt 5100 
ggcgggtgtc ggggcgcagc catgacccag tcacgtagcg atagcggagt gtatactggc 5160 
ttaactatgc ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac 5220 
cgcacagatg cgtaaggaga aaataccgca tcaggcgctc ttccgcttcc tcgctcactg 5280 
actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa 5340 
tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc 5400 
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc 5460 
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat 5520 
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 5580 
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct 5640 
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 5700 
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 5760 
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 5820 
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 5880 
ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta 5940 
gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc 6000 
agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg 6060 
acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgcattct aggtactaaa 6120 
acaattcatc cagtaaaata taatatttta ttttctccca atcaggcttg atccccagta 6180 
agtcaaaaaa tagctcgaca tactgttctt ccccgatatc ctccctgatc gaccggacgc 6240 
agaaggcaat gtcataccac ttgtccgccc tgccgcttct cccaagatca ataaagccac 6300 
ttactttgcc atctttcaca aagatgttgc tgtctcccag gtcgccgtgg gaaaagacaa 6360 
gttcctcttc gggcttttcc gtctttaaaa aatcatacag ctcgcgcgga tctttaaatg 6420 
gagtgtcttc ttcccagttt tcgcaatcca catcggccag atcgttattc agtaagtaat 6480 
ccaattcggc taagcggctg tctaagctat tcgtataggg acaatccgat atgtcgatgg 6540 
agtgaaagag cctgatgcac tccgcataca gctcgataat cttttcaggg ctttgttcat 6600 
cttcatactc ttccgagcaa aggacgccat cggcctcact catgagcaga ttgctccagc 6660 
catcatgccg ttcaaagtgc aggacctttg gaacaggcag ctttccttcc agccatagca 6720 
tcatgtcctt ttcccgttcc acatcatagg tggtcccttt ataccggctg tccgtcattt 6780 
ttaaatatag gttttcattt tctcccacca gcttatatac cttagcagga gacattcctt 6840 
ccgtatcttt tacgcagcgg tatttttcga tcagtttttt caattccggt gatattctca 6900 
ttttagccat ttattatttc cttcctcttt tctacagtat ttaaagatac cccaagaagc 6960 
taattataac aagacgaact ccaattcact gttccttgca ttctaaaacc ttaaatacca 7020 
gaaaacagct ttttcaaagt tgttttcaaa gttggcgtat aacatagtat cgacggagcc 7080 
gattttgaaa ccgcggtgat cacaggcagc aacgctctgt catcgttaca atcaacatgc 7140 
taccctccgc gagatcatcc gtgtttcaaa cccggcagct tagttgccgt tcttccgaat 7200 
agcatcggta acatgagcaa agtctgccgc cttacaacgg ctctcccgct gacgccgtcc 7260 
cggactgatg ggctgcctgt atcgagtggt gattttgtgc cgagctgccg gtcggggagc 7320 
tgttggctgg ctggtggcag gatatattgt ggtgtaaaca aattgacgct tagacaactt 7380 
aataacacat tgcggacgtt tttaatgtac tgaattaacg ccgaattaat tcgggggatc 7440 
tggattttag tactggattt tggttttagg aattagaaat tttattgata gaagtatttt 7500 
acaaatacaa atacatacta agggtttctt atatgctcaa cacatgagcg aaaccctata 7560 
ggaaccctaa ttcccttatc tgggaactac tcacacatta ttatggagaa actcgagctt 7620 
gtcgatcgac agatccggtc ggcatctact ctatttcttt gccctcggac gagtgctggg 7680 
gcgtcggttt ccactatcgg cgagtacttc tacacagcca tcggtccaga cggccgcgct 7740 
tctgcgggcg atttgtgtac gcccgacagt cccggctccg gatcggacga ttgcgtcgca 7800 
tcgaccctgc gcccaagctg catcatcgaa attgccgtca accaagctct gatagagttg 7860 
gtcaagacca atgcggagca tatacgcccg gagtcgtggc gatcctgcaa gctccggatg 7920 
cctccgctcg aagtagcgcg tctgctgctc catacaagcc aaccacggcc tccagaagaa 7980 
gatgttggcg acctcgtatt gggaatcccc gaacatcgcc tcgctccagt caatgaccgc 8040 
tgttatgcgg ccattgtccg tcaggacatt gttggagccg aaatccgcgt gcacgaggtg 8100 
ccggacttcg gggcagtcct cggcccaaag catcagctca tcgagagcct gcgcgacgga 8160 
cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac acatggggat cagcaatcgc 8220 
gcatatgaaa tcacgccatg tagtgtattg accgattcct tgcggtccga atgggccgaa 8280 
cccgctcgtc tggctaagat cggccgcagc gatcgcatcc atagcctccg cgaccggttg 8340 
tagaacagcg ggcagttcgg tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg 8400 
ggagatgcaa taggtcaggc tctcgctaaa ctccccaatg tcaagcactt ccggaatcgg 8460 
gagcgcggcc gatgcaaagt gccgataaac ataacgatct ttgtagaaac catcggcgca 8520 
gctatttacc cgcaggacat atccacgccc tcctacatcg aagctgaaag cacgagattc 8580 
ttcgccctcc gagagctgca tcaggtcgga gacgctgtcg aacttttcga tcagaaactt 8640 
ctcgacagac gtcgcggtga gttcaggctt tttcatatct cattgccccc cgggatctgc 8700 
gaaagctcga gagagataga tttgtagaga gagactggtg atttcagegt gtcctctcca 8760 
aatgaaatga acttccttat atagaggaag gtcttgcgaa ggatagtggg attgtgcgtc 8820 
atcccttacg tcagtggaga tatcacatca atccacttgc tttgaagacg tggttggaac 8880 
gtcttctttt tccacgatgc tcctcgtggg tgggggtcca tctttgggac cactgtcggc 8940 
agaggcatct tgaacgatag cctttccttt atcgcaatga tggcatttgt aggtgccacc 9000 
ttccttttct actgtccttt tgatgaagtg acagatagct gggcaatgga atccgaggag 9060 
gtttcccgat attacccttt gttgaaaagt ctcaatagcc ctttggtctt ctgagactgt 9120 
atctttgata ttcttggagt agacgagagt gtcgtgctcc accatgttat cacatcaatc 9180 
cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc tcgtgggtgg 9240 
gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct ttcctttatc 9300 
gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga tgaagtgaca 9360 
gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt gaaaagtctc 9420 
aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga cgagagtgtc 9480 
gtgctccacc atgttggcaa gctgctctag ccaatacgca aaccgcctct ccccgcgcgt 9540 
tggccgattc attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag 9600 
cgcaacgcaa ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg 9660 
cttccggctc gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc 9720 
tatgaccatg attacgaatt cgagctcggt acccggggat cctctagagt cgacctgcag 9780 
gcatgcaagc ttggcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt 9840 
tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga 9900 
ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat gctagagcag 9960 
cttgagcttg gatcagattg tcgtttcccg ccttcagttt agcttcatgg agtcaaagat 10020 
tcaaatagag gacctaacag aactcgccgt aaagactggc gaacagttca tacagagtct 10080 
cttacgactc aatgacaaga agaaaatctt cgtcaacatg gtggagcacg acacacttgt 10140 
ctactccaaa aatatcaaag atacagtctc agaagaccaa agggcaattg agacttttca 10200 
acaaagggta atatccggaa acctcctcgg attccattgc ccagctatct gtcactttat 10260 
tgtgaagata gtggaaaagg aaggtggctc ctacaaatgc catcattgcg ataaaggaaa 10320 
ggccatcgtt gaagatgcct ctgccgacag tggtcccaaa gatggacccc cacccacgag 10380 
gagcatcgtg gaaaaagaag acgttccaac cacgtcttca aagcaagtgg attgatgtga 10440 
tatctccact gacgtaaggg atgacgcaca atcccactat ccttcgcaag acccttcctc 10500 
tatataagga agttcatttc atttggagag aacacggggg actcttgac 10549 

<210> 4 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> CaMV35SpolyA Primer 
<400> 4 

ctgaattaac gccgaattaa ttcgggggat ctg 33 

<210> 5 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> CaMV35Spr Primer 
<400> 5 

ctagagcagc ttgccaacat ggtggagca 29 

<210> 6 
<211> 12592 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAg2 Plasmid 
<400> 6 

gtacgaagaa ggccaagaac ggccgcctgg tgacggtatc cgagggtgaa gccttgatta 60 
gccgctacaa gatcgtaaag agcgaaaccg ggcggccgga gtacatcgag atcgagctag 120 
ctgattggat gtaccgcgag atcacagaag gcaagaaccc ggacgtgctg acggttcacc 180 
ccgattactt tttgatcgat cccggcatcg gccgttttct ctaccgcctg gcacgccgcg 240 
ccgcaggcaa ggcagaagcc agatggttgt tcaagacgat ctacgaacgc agtggcagcg 300 
ccggagagtt caagaagttc tgtttcaccg tgcgcaagct gatcgggtca aatgacctgc 360 
cggagtacga tttgaaggag gaggcggggc aggctggccc gatcctagtc atgcgctacc 420 
gcaacctgat cgagggcgaa gcatccgccg gttcctaatg tacggagcag atgctagggc 480 
aaattgccct agcaggggaa aaaggtcgaa aaggtctctt tcctgtggat agcacgtaca 540 
ttgggaaccc aaagccgtac attgggaacc ggaacccgta cattgggaac ccaaagccgt 600 
acattgggaa ccggtcacac atgtaagtga ctgatataaa agagaaaaaa ggcgattttt 660 
ccgcctaaaa ctctttaaaa cttattaaaa ctcttaaaac ccgcctggcc tgtgcataac 720 
tgtctggcca gcgcacagcc gaagagctgc aaaaagcgcc tacccttcgg tcgctgcgct 780 
ccctacgccc cgccgcttcg cgtcggccta tcgcggccgc tggccgctca aaaatggctg 840 
gcctacggcc aggcaatcta ccagggcgcg gacaagccgc gccgtcgcca ctcgaccgcc 900 
ggcgcccaca tcaaggcacc ctgcctcgcg cgtttcggtg atgacggtga aaacctctga 960 
cacatgcagc tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa 1020 
gcccgtcagg gcgcgtcagc gggtgttggc gggtgtcggg gcgcagccat gacccagtca 1080 
cgtagcgata gcggagtgta tactggctta actatgcggc atcagagcag attgtactga 1140 
gagtgcacca tatgcggtgt gaaataccgc acagatgcgt aaggagaaaa taccgcatca 1200 
ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 1260 
cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag 1320 
gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 1380 
tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc 1440 
agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 1500 
tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 1560 
cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 1620 
ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 1680 
ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 1740 
ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 1800 
ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct ctgctgaagc 1860 
cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 1920 
gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 1980 
atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 2040 
ttttggtcat gcattctagg tactaaaaca attcatccag taaaatataa tattttattt 2100 
tctcccaatc aggcttgatc cccagtaagt caaaaaatag ctcgacatac tgttcttccc 2160 
cgatatcctc cctgatcgac cggacgcaga aggcaatgtc ataccacttg tccgccctgc 2220 
cgcttctccc aagatcaata aagccactta ctttgccatc tttcacaaag atgttgctgt 2280 
ctcccaggtc gccgtgggaa aagacaagtt cctcttcggg cttttccgtc tttaaaaaat 2340 
catacagctc gcgcggatct ttaaatggag tgtcttcttc ccagttttcg caatccacat 2400 
cggccagatc gttattcagt aagtaatcca attcggctaa gcggctgtct aagctattcg 2460 
tatagggaca atccgatatg tcgatggagt gaaagagcct gatgcactcc gcatacagct 2520 
cgataatctt ttcagggctt tgttcatctt catactcttc cgagcaaagg acgccatcgg 2580 
cctcactcat gagcagattg ctccagccat catgccgttc aaagtgcagg acctttggaa 2640 
caggcagctt tccttccagc catagcatca tgtccttttc ccgttccaca tcataggtgg 2700 
tccctttata ccggctgtcc gtcattttta aatataggtt ttcattttct cccaccagct 2760 
tatatacctt agcaggagac attccttccg tatcttttac gcagcggtat ttttcgatca 2820 
gttttttcaa ttccggtgat attctcattt tagccattta ttatttcctt cctcttttct 2880 
acagtattta aagatacccc aagaagctaa ttataacaag acgaactcca attcactgtt 2940 
ccttgcattc taaaacctta aataccagaa aacagctttt tcaaagttgt tttcaaagtt 3000 
ggcgtataac atagtatcga cggagccgat tttgaaaccg cggtgatcac aggcagcaac 3060 
gctctgtcat cgttacaatc aacatgctac cctccgcgag atcatccgtg tttcaaaccc 3120 
ggcagcttag ttgccgttct tccgaatagc atcggtaaca tgagcaaagt ctgccgcctt 3180 
acaacggctc tcccgctgac gccgtcccgg actgatgggc tgcctgtatc gagtggtgat 3240 
tttgtgccga gctgccggtc ggggagctgt tggctggctg gtggcaggat atattgtggt 3300 
gtaaacaaat tgacgcttag acaacttaat aacacattgc ggacgttttt aatgtactga 3360 
attaacgccg aattaattcg ggggatctgg attttagtac tggattttgg ttttaggaat 3420 
tagaaatttt attgatagaa gtattttaca aatacaaata catactaagg gtttcttata 3480 
tgctcaacac atgagcgaaa ccctatagga accctaattc ccttatctgg gaactactca 3540 
cacattatta tggagaaact cgagtcaaat ctcggtgacg ggcaggaccg gacggggcgg 3600 
taccggcagg ctgaagtcca gctgccagaa acccacgtca tgccagttcc cgtgcttgaa 3660 
gccggccgcc cgcagcatgc cgcggggggc atatccgagc gcctcgtgca tgcgcacgct 3720 
cgggtcgttg ggcagcccga tgacagcgac cacgctcttg aagccctgtg cctccaggga 3780 
cttcagcagg tgggtgtaga gcgtggagcc cagtcccgtc cgctggtggc ggggggagac 3840 
gtacacggtc gactcggccg tccagtcgta ggcgttgcgt gccttccagg ggcccgcgta 3900 
ggcgatgccg gcgacctcgc cgtccacctc ggcgacgagc cagggatagc gctcccgcag 3960 
acggacgagg tcgtccgtcc actcctgcgg ttcctgcggc tcggtacgga agttgaccgt 4020 
gcttgtctcg atgtagtggt tgacgatggt gcagaccgcc ggcatgtccg cctcggtggc 4080 
acggcggatg tcggccgggc gtcgttctgg gctcatggta gactcgagag agatagattt 4140 
gtagagagag actggtgatt tcagcgtgtc ctctccaaat gaaatgaact tccttatata 4200 
gaggaaggtc ttgcgaagga tagtgggatt gtgcgtcatc ccttacgtca gtggagatat 4260 
cacatcaatc cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc 4320 
tcgtgggtgg gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct 4380 
ttcctttatc gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga 4440 
tgaagtgaca gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt 4500 
gaaaagtctc aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga 4560 
cgagagtgtc gtgctccacc atgttatcac atcaatccac ttgctttgaa gacgtggttg 4620 
gaacgtcttc tttttccacg atgctcctcg tgggtggggg tccatctttg ggaccactgt 4680 
cggcagaggc atcttgaacg atagcctttc ctttatcgca atgatggcat ttgtaggtgc 4740 
caccttcctt ttctactgtc cttttgatga agtgacagat agctgggcaa tggaatccga 4800 
ggaggtttcc cgatattacc ctttgttgaa aagtctcaat agccctttgg tcttctgaga 4860 
ctgtatcttt gatattcttg gagtagacga gagtgtcgtg ctccaccatg ttggcaagct 4920 
gctctagcca atacgcaaac cgcctctccc cgcgcgttgg ccgattcatt aatgcagctg 4980 
gcacgacagg tttcccgact ggaaagcggg cagtgagcgc aacgcaatta atgtgagtta 5040 
gctcactcat taggcacccc aggctttaca ctttatgctt ccggctcgta tgttgtgtgg 5100 
aattgtgagc ggataacaat ttcacacagg aaacagctat gaccatgatt acgaattcga 5160 
gccttgacta gagggtcgac ggtatacaga catgataaga tacattgatg agtttggaca 5220 
aaccacaact agaatgcagt gaaaaaaatg ctttatttgt gaaatttgtg atgctattgc 5280 
tttatttgta accattataa gctgcaataa acaagttggg gtgggcgaag aactccagca 5340 
tgagatcccc gcgctggagg atcatccagc cggcgtcccg gaaaacgatt ccgaagccca 5400 
acctttcata gaaggcggcg gtggaatcga aatctcgtag cacgtgtcag tcctgctcct 5460 
cggccacgaa gtgcacgcag ttgccggccg ggtcgcgcag ggcgaactcc cgcccccacg 5520 
gctgctcgcc gatctcggtc atggccggcc cggaggcgtc ccggaagttc gtggacacga 5580 
cctccgacca ctcggcgtac agctcgtcca ggccgcgcac ccacacccag gccagggtgt 5640 
tgtccggcac cacctggtcc tggaccgcgc tgatgaacag ggtcacgtcg tcccggacca 5700 
caccggcgaa gtcgtcctcc acgaagtccc gggagaaccc gagccggtcg gtccagaact 5760 
cgaccgctcc ggcgacgtcg cgcgcggtga gcaccggaac ggcactggtc aacttggcca 5820 
tggatccaga tttcgctcaa gttagtataa aaaagcaggc ttcaatcctg caggaattcg 5880 
atcgacactc tcgtctactc caagaatatc aaagatacag tctcagaaga ccaaagggct 5940 
attgagactt ttcaacaaag ggtaatatcg ggaaacctcc tcggattcca ttgcccagct 6000 
atctgtcact tcatcaaaag gacagtagaa aaggaaggtg gcacctacaa atgccatcat 6060 
tgcgataaag gaaaggctat cgttcaagat gcctctgccg acagtggtcc caaagatgga 6120 
cccccaccca cgaggagcat cgtggaaaaa gaagacgttc caaccacgtc ttcaaagcaa 6180 
gtggattgat gtgataacat ggtggagcac gacactctcg tctactccaa gaatatcaaa 6240 
gatacagtct cagaagacca aagggctatt gagacttttc aacaaagggt aatatcggga 6300 
aacctcctcg gattccattg cccagctatc tgtcacttca tcaaaaggac agtagaaaag 6360 
gaaggtggca cctacaaatg ccatcattgc gataaaggaa aggctatcgt tcaagatgcc 6420 
tctgccgaca gtggtcccaa agatggaccc ccacccacga ggagcatcgt ggaaaaagaa 6480 
gacgttccaa ccacgtcttc aaagcaagtg gattgatgtg atatctccac tgacgtaagg 6540 
gatgacgcac aatcccacta tccttcgcaa gaccttcctc tatataagga agttcatttc 6600 
atttggagag gacacgctga aatcaccagt ctctctctac aaatctatct ctctcgagct 6660 
ttcgcagatc cgggggggca atgagatatg aaaaagcctg aactcaccgc gacgtctgtc 6720 
gagaagtttc tgatcgaaaa gttcgacagc gtctccgacc tgatgcagct ctcggagggc 6780 
gaagaatctc gtgctttcag cttcgatgta ggagggcgtg gatatgtcct gcgggtaaat 6840 
agctgcgccg atggtttcta caaagatcgt tatgtttatc ggcactttgc atcggccgcg 6900 
ctcccgattc cggaagtgct tgacattggg gagtttagcg agagcctgac ctattgcatc 6960 
tcccgccgtg cacagggtgt cacgttgcaa gacctgcctg aaaccgaact gcccgctgtt 7020 
ctacaaccgg tcgcggaggc tatggatgcg atcgctgcgg ccgatcttag ccagacgagc 7080 
gggttcggcc cattcggacc gcaaggaatc ggtcaataca ctacatggcg tgatttcata 7140 
tgcgcgattg ctgatcccca tgtgtatcac tggcaaactg tgatggacga caccgtcagt 7200 
gcgtccgtcg cgcaggctct cgatgagctg atgctttggg ccgaggactg ccccgaagtc 7260 
cggcacctcg tgcacgcgga tttcggctcc aacaatgtcc tgacggacaa tggccgcata 7320 
acagcggtca ttgactggag cgaggcgatg ttcggggatt cccaatacga ggtcgccaac 7380 
atcttcttct ggaggccgtg gttggcttgt atggagcagc agacgcgcta cttcgagcgg 7440 
aggcatccgg agcttgcagg atcgccacga ctccgggcgt atatgctccg cattggtctt 7500 
gaccaactct atcagagctt ggttgacggc aatttcgatg atgcagcttg ggcgcagggt 7560 
cgatgcgacg caatcgtccg atccggagcc gggactgtcg ggcgtacaca aatcgcccgc 7620 
agaagcgcgg ccgtctggac cgatggctgt gtagaagtac tcgccgatag tggaaaccga 7680 
cgccccagca ctcgtccgag ggcaaagaaa tagagtagat gccgaccgga tctgtcgatc 7740 
gacaagctcg agtttctcca taataatgtg tgagtagttc ccagataagg gaattagggt 7800 
tcctataggg tttcgctcat gtgttgagca tataagaaac ccttagtatg tatttgtatt 7860 
tgtaaaatac ttctatcaat aaaatttcta attcctaaaa ccaaaatcca gtactaaaat 7920 
ccagatcccc cgaattaatt cggcgttaat tcagatcaag cttggcactg gccgtcgttt 7980 
tacaacgtcg tgactgggaa aaccctggcg ttacccaact taatcgcctt gcagcacatc 8040 
cccctttcgc cagctggcgt aatagcgaag aggcccgcac cgatcgccct tcccaacagt 8100 
tgcgcagcct gaatggcgaa tgctagagca gcttgagctt ggatcagatt gtcgtttccc 8160 
gccttcagtt tggggatcct ctagactgaa ggcgggaaac gacaatctga tcatgagcgg 8220 
agaattaagg gagtcacgtt atgacccccg ccgatgacgc gggacaagcc gttttacgtt 8280 
tggaactgac agaaccgcaa cgttgaagga gccactcagc cgcgggtttc tggagtttaa 8340 
tgagctaagc acatacgtca gaaaccatta ttgcgcgttc aaaagtcgcc taaggtcact 8400 
atcagctagc aaatatttct tgtcaaaaat gctccactga cgttccataa attcccctcg 8460 
gtatccaatt agagtctcat attcactctc aatccaaata atctgcaccg gatctcgaga 8520 
atcgaattcc cgcggccgcc atggtagatc tgactagtaa aggagaagaa cttttcactg 8580 
gagttgtccc aattcttgtt gaattagatg gtgatgttaa tgggcacaaa ttttctgtca 8640 
gtggagaggg tgaaggtgat gcaacatacg gaaaacttac ccttaaattt atttgcacta 8700 
ctggaaaact acctgttccg tggccaacac ttgtcactac tttctcttat ggtgttcaat 8760 
gcttttcaag atacccagat catatgaagc ggcacgactt cttcaagagc gccatgcctg 8820 
agggatacgt gcaggagagg accatcttct tcaaggacga cgggaactac aagacacgtg 8880 
ctgaagtcaa gtttgaggga gacaccctcg tcaacaggat cgagcttaag ggaatcgatt 8940 
tcaaggagga cggaaacatc ctcggccaca agttggaata caactacaac tcccacaacg 9000 
tatacatcat ggccgacaag caaaagaacg gcatcaaagc caacttcaag acccgccaca 9060 
acatcgaaga cggcggcgtg caactcgctg atcattatca acaaaatact ccaattggcg 9120 
atggccctgt ccttttacca gacaaccatt acctgtccac acaatctgcc ctttcgaaag 9180 
atcccaacga aaagagagac cacatggtcc ttcttgagtt tgtaacagct gctgggatta 9240 
cacatggcat ggatgaacta tacaaagcta gccaccacca ccaccaccac gtgtgaattg 9300 
gtgaccagct cgaatttccc cgatcgttca aacatttggc aataaagttt cttaagattg 9360 
aatcctgttg ccggtcttgc gatgattatc atataatttc tgttgaatta cgttaagcat 9420 
gtaataatta acatgtaatg catgacgtta tttatgagat gggtttttat gattagagtc 9480 
ccgcaattat acatttaata cgcgatagaa aacaaaatat agcgcgcaaa ctaggataaa 9540 
ttatcgcgcg cggtgtcatc tatgttacta gatcgggaat taaactatca gtgtttgaca 9600 
ggatatattg gcgggtaaac ctaagagaaa agagcgttta ttagaataac ggatatttaa 9660 
aagggcgtga aaaggtttat ccgttcgtcc atttgtatgt gcatgccaac cacagggttc 9720 
ccctcgggat caaagtactt tgatccaacc cctccgctgc tatagtgcag tcggcttctg 9780 
acgttcagtg cagccgtctt ctgaaaacga catgtcgcac aagtcctaag ttacgcgaca 9840 
ggctgccgcc ctgccctttt cctggcgttt tcttgtcgcg tgttttagtc gcataaagta 9900 
gaatacttgc gactagaacc ggagacatta cgccatgaac aagagcgccg ccgctggcct 9960 
gctgggctat gcccgcgtca gcaccgacga ccaggacttg accaaccaac gggccgaact 10020 
gcacgcggcc ggctgcacca agctgttttc cgagaagatc accggcacca ggcgcgaccg 10080 
cccggagctg gccaggatgc ttgaccacct acgccctggc gacgttgtga cagtgaccag 10140 
gctagaccgc ctggcccgca gcacccgcga cctactggac attgccgagc gcatccagga 10200 
ggccggcgcg ggcctgcgta gcctggcaga gccgtgggcc gacaccacca cgccggccgg 10260 
ccgcatggtg ttgaccgtgt tcgccggcat tgccgagttc gagcgttccc taatcatcga 10320 
ccgcacccgg agcgggcgcg aggccgccaa ggcccgaggc gtgaagtttg gcccccgccc 10380 
taccctcacc ccggcacaga tcgcgcacgc ccgcgagctg atcgaccagg aaggccgcac 10440 
cgtgaaagag gcggctgcac tgcttggcgt gcatcgctcg accctgtacc gcgcacttga 10500 
gcgcagcgag gaagtgacgc ccaccgaggc caggcggcgc ggtgccttcc gtgaggacgc 10560 
attgaccgag gccgacgccc tggcggccgc cgagaatgaa cgccaagagg aacaagcatg 10620 
aaaccgcacc aggacggcca ggacgaaccg tttttcatta ccgaagagat cgaggcggag 10680 
atgatcgcgg ccgggtacgt gttcgagccg cccgcgcacg tctcaaccgt gcggctgcat 10740 
gaaatcctgg ccggtttgtc tgatgccaag ctggcggcct ggccggccag cttggccgct 10800 
gaagaaaccg agcgccgccg tctaaaaagg tgatgtgtat ttgagtaaaa cagcttgcgt 10860 
catgcggtcg ctgcgtatat gatgcgatga gtaaataaac aaatacgcaa ggggaacgca 10920 
tgaaggttat cgctgtactt aaccagaaag gcgggtcagg caagacgacc atcgcaaccc 10980 
atctagcccg cgccctgcaa ctcgccgggg ccgatgttct gttagtcgat tccgatcccc 11040 
agggcagtgc ccgcgattgg gcggccgtgc gggaagatca accgctaacc gttgtcggca 11100 
tcgaccgccc gacgattgac cgcgacgtga aggccatcgg ccggcgcgac ttcgtagtga 11160 
tcgacggagc gccccaggcg gcggacttgg ctgtgtccgc gatcaaggca gccgacttcg 11220 
tgctgattcc ggtgcagcca agcccttacg acatatgggc caccgccgac ctggtggagc 11280 
tggttaagca gcgcattgag gtcacggatg gaaggctaca agcggccttt gtcgtgtcgc 11340 
gggcgatcaa aggcacgcgc atcggcggtg aggttgccga ggcgctggcc gggtacgagc 11400 
tgcccattct tgagtcccgt atcacgcagc gcgtgagcta cccaggcact gccgccgccg 11460 
gcacaaccgt tcttgaatca gaacccgagg gcgacgctgc ccgcgaggtc caggcgctgg 11520 
ccgctgaaat taaatcaaaa ctcatttgag ttaatgaggt aaagagaaaa tgagcaaaag 11580 
cacaaacacg ctaagtgccg gccgtccgag cgcacgcagc agcaaggctg caacgttggc 11640 
cagcctggca gacacgccag ccatgaagcg ggtcaacttt cagttgccgg cggaggatca 11700 
caccaagctg aagatgtacg cggtacgcca aggcaagacc attaccgagc tgctatctga 11760 
atacatcgcg cagctaccag agtaaatgag caaatgaata aatgagtaga tgaattttag 11820 
cggctaaagg aggcggcatg gaaaatcaag aacaaccagg caccgacgcc gtggaatgcc 11880 
ccatgtgtgg aggaacgggc ggttggccag gcgtaagcgg ctgggttgtc tgccggccct 11940 
gcaatggcac tggaaccccc aagcccgagg aatcggcgtg acggtcgcaa accatccggc 12000 
ccggtacaaa tcggcgcggc gctgggtgat gacctggtgg agaagttgaa ggccgcgcag 12060 
gccgcccagc ggcaacgcat cgaggcagaa gcacgccccg gtgaatcgtg gcaagcggcc 12120 
gctgatcgaa tccgcaaaga atcccggcaa ccgccggcag ccggtgcgcc gtcgattagg 12180 
aagccgccca agggcgacga gcaaccagat tttttcgttc cgatgctcta tgacgtgggc 12240 
acccgcgata gtcgcagcat catggacgtg gccgttttcc gtctgtcgaa gcgtgaccga 12300 
cgagctggcg aggtgatccg ctacgagctt ccagacgggc acgtagaggt ttccgcaggg 12360 
ccggccggca tggccagtgt gtgggattac gacctggtac tgatggcggt ttcccatcta 12420 
accgaatcca tgaaccgata ccgggaaggg aagggagaca agcccggccg cgtgttccgt 12480 
ccacacgttg cggacgtact caagttctgc cggcgagccg atggcggaaa gcagaaagac 12540 
gacctggtag aaacctgcat tcggttaaac accacgcacg ttgccatgca gc 12592 
<210> 7 
<211> 3357 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pGEMEasyNOS Plasmid 
<400> 7 

tatcactagt gaattcgcgg ccgcctgcag gtcgaccata tgggagagct cccaacgcgt 60
tggatgcata gcttgagtat tctatagtgt cacctaaata gcttggcgta atcatggtca 120
tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga 180
agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg 240
cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc 300
caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac 360
tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata 420
cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 480
aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct 540
gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa 600
agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 660
cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca 720
cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 780
ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 840
gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 900
tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga 960
acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc 1020
tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag 1080
attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac 1140
gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc 1200
ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag 1260
taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt 1320
ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag 1380
ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca 1440
gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact 1500
ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca 1560
gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg 1620
tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc 1680
atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg 1740
gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca 1800
tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt 1860
atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc 1920
agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 1980
ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca 2040
tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa 2100
aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat 2160
tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa 2220
aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga tgcggtgtga 2280
aataccgcac agatgcgtaa ggagaaaata ccgcatcagg aaattgtaag cgttaatatt 2340
ttgttaaaat tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa 2400
atcggcaaaa tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca 2460
gtttggaaca agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc 2520
gtctatcagg gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg 2580
aggtgccgta aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg 2640
ggaaagccgg cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg 2700
gcgctggcaa gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg 2760
ccgctacagg gcgcgtccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg 2820
tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg atgtgctgca aggcgattaa 2880
gttgggtaac gccagggttt tcccagtcac gacgttgtaa aacgacggcc agtgaattgt 2940
aatacgactc actatagggc gaattgggcc cgacgtcgca tgctcccggc cgccatggcg 3000
gccgcgggaa ttcgattctc gagatccggt gcagattatt tggattgaga gtgaatatga 3060
gactctaatt ggataccgag gggaatttat ggaacgtcag tggagcattt ttgacaagaa 3120
atatttgcta gctgatagtg accttaggcg acttttgaac gcgcaataat ggtttctgac 3180
gtatgtgctt agctcattaa actccagaaa cccgcggctg agtggctcct tcaacgttgc 3240
ggttctgtca gttccaaacg taaaacggct tgtcccgcgt catcggcggg ggtcataacg 3300
tgactccctt aattctccgc tcatgatcag attgtcgttt cccgccttca gtctaga 3357

<210> 8
<211> 10122
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> p1302NOS Plasmid
<400> 8 

catggtagat ctgactagta aaggagaaga acttttcact ggagttgtcc caattcttgt 60 
tgaattagat ggtgatgtta atgggcacaa attttctgtc agtggagagg gtgaaggtga 120
tgcaacatac ggaaaactta cccttaaatt tatttgcact actggaaaac tacctgttcc 180 
gtggccaaca cttgtcacta ctttctctta tggtgttcaa tgcttttcaa gatacccaga 240 
tcatatgaag cggcacgact tcttcaagag cgccatgcct gagggatacg tgcaggagag 300 
gaccatcttc ttcaaggacg acgggaacta caagacacgt gctgaagtca agtttgaggg 360 
agacaccctc gtcaacagga tcgagcttaa gggaatcgat ttcaaggagg acggaaacat 420 
cctcggccac aagttggaat acaactacaa ctcccacaac gtatacatca tggccgacaa 480 
gcaaaagaac ggcatcaaag ccaacttcaa gacccgccac aacatcgaag acggcggcgt 540 
gcaactcgct gatcattatc aacaaaatac tccaattggc gatggccctg tccttttacc 600 
agacaaccat tacctgtcca cacaatctgc cctttcgaaa gatcccaacg aaaagagaga 660 
ccacatggtc cttcttgagt ttgtaacagc tgctgggatt acacatggca tggatgaact 720 
atacaaagct agccaccacc accaccacca cgtgtgaatt ggtgaccagc tcgaatttcc 780 
ccgatcgttc aaacatttgg caataaagtt tcttaagatt gaatcctgtt gccggtcttg 840 
cgatgattat catataattt ctgttgaatt acgttaagca tgtaataatt aacatgtaat 900 
gcatgacgtt atttatgaga tgggttttta tgattagagt cccgcaatta tacatttaat 960 
acgcgataga aaacaaaata tagcgcgcaa actaggataa attatcgcgc gcggtgtcat 1020 
ctatgttact agatcgggaa ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 1080 
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 1140 
tccgttcgtc catttgtatg tgcatgccaa ccacagggtt cccctcggga tcaaagtact 1200 
ttgatccaac ccctccgctg ctatagtgca gtcggcttct gacgttcagt gcagccgtct 1260 
tctgaaaacg acatgtcgca caagtcctaa gttacgcgac aggctgccgc cctgcccttt 1320 
tcctggcgtt ttcttgtcgc gtgttttagt cgcataaagt agaatacttg cgactagaac 1380
cggagacatt acgccatgaa caagagcgcc gccgctggcc tgctgggcta tgcccgcgtc 1440
agcaccgacg accaggactt gaccaaccaa cgggccgaac tgcacgcggc cggctgcacc 1500 
aagctgtttt ccgagaagat caccggcacc aggcgcgacc gcccggagct ggccaggatg 1560 
cttgaccacc tacgccctgg cgacgttgtg acagtgacca ggctagaccg cctggcccgc 1620 
agcacccgcg acctactgga cattgccgag cgcatccagg aggccggcgc gggcctgcgt 1680 
agcctggcag agccgtgggc cgacaccacc acgccggccg gccgcatggt gttgaccgtg 1740 
ttcgccggca ttgccgagtt cgagcgttcc ctaatcatcg accgcacccg gagcgggcgc 1800 
gaggccgcca aggcccgagg cgtgaagttt ggcccccgcc ctaccctcac cccggcacag 1860 
atcgcgcacg cccgcgagct gatcgaccag gaaggccgca ccgtgaaaga ggcggctgca 1920 
ctgcttggcg tgcatcgctc gaccctgtac cgcgcacttg agcgcagcga ggaagtgacg 1980 
cccaccgagg ccaggcggcg cggtgccttc cgtgaggacg cattgaccga ggccgacgcc 2040 
ctggcggccg ccgagaatga acgccaagag gaacaagcat gaaaccgcac caggacggcc 2100 
aggacgaacc gtttttcatt accgaagaga tcgaggcgga gatgatcgcg gccgggtacg 2160 
tgttcgagcc gcccgcgcac gtctcaaccg tgcggctgca tgaaatcctg gccggtttgt 2220 
ctgatgccaa gctggcggcc tggccggcca gcttggccgc tgaagaaacc gagcgccgcc 2280
gtctaaaaag gtgatgtgta tttgagtaaa acagcttgcg tcatgcggtc gctgcgtata 2340 
tgatgcgatg agtaaataaa caaatacgca aggggaacgc atgaaggtta tcgctgtact 2400 
taaccagaaa ggcgggtcag gcaagacgac catcgcaacc catctagccc gcgccctgca 2460
actcgccggg gccgatgttc tgttagtcga ttccgatccc cagggcagtg cccgcgattg 2520 
ggcggccgtg cgggaagatc aaccgctaac cgttgtcggc atcgaccgcc cgacgattga 2580 
ccgcgacgtg aaggccatcg gccggcgcga cttcgtagtg atcgacggag cgccccaggc 2640 
ggcggacttg gctgtgtccg cgatcaaggc agccgacttc gtgctgattc cggtgcagcc 2700 
aagcccttac gacatatggg ccaccgccga cctggtggag ctggttaagc agcgcattga 2760 
ggtcacggat ggaaggctac aagcggcctt tgtcgtgtcg cgggcgatca aaggcacgcg 2820 
catcggcggt gaggttgccg aggcgctggc cgggtacgag ctgcccattc ttgagtcccg 2880 
tatcacgcag cgcgtgagct acccaggcac tgccgccgcc ggcacaaccg ttcttgaatc 2940 
agaacccgag ggcgacgctg cccgcgaggt ccaggcgctg gccgctgaaa ttaaatcaaa 3000 
actcatttga gttaatgagg taaagagaaa atgagcaaaa gcacaaacac gctaagtgcc 3060 
ggccgtccga gcgcacgcag cagcaaggct gcaacgttgg ccagcctggc agacacgcca 3120 
gccatgaagc gggtcaactt tcagttgccg gcggaggatc acaccaagct gaagatgtac 3180 
gcggtacgcc aaggcaagac cattaccgag ctgctatctg aatacatcgc gcagctacca 3240 
gagtaaatga gcaaatgaat aaatgagtag atgaatttta gcggctaaag gaggcggcat 3300
ggaaaatcaa gaacaaccag gcaccgacgc cgtggaatgc cccatgtgtg gaggaacggg 3360 
cggttggcca ggcgtaagcg gctgggttgt ctgccggccc tgcaatggca ctggaacccc 3420 
caagcccgag gaatcggcgt gacggtcgca aaccatccgg cccggtacaa atcggcgcgg 3480 
cgctgggtga tgacctggtg gagaagttga aggccgcgca ggccgcccag cggcaacgca 3540 
tcgaggcaga agcacgcccc ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag 3600
aatcccggca accgccggca gccggtgcgc cgtcgattag gaagccgccc aagggcgacg 3660 
agcaaccaga ttttttcgtt ccgatgctct atgacgtggg cacccgcgat agtcgcagca 3720 
tcatggacgt ggccgttttc cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc 3780 
gctacgagct tccagacggg cacgtagagg tttccgcagg gccggccggc atggccagtg 3840
tgtgggatta cgacctggta ctgatggcgg tttcccatct aaccgaatcc atgaaccgat 3900 
accgggaagg gaagggagac aagcccggcc gcgtgttccg tccacacgtt gcggacgtac 3960 
tcaagttctg ccggcgagcc gatggcggaa agcagaaaga cgacctggta gaaacctgca 4020 
ttcggttaaa caccacgcac gttgccatgc agcgtacgaa gaaggccaag aacggccgcc 4080
tggtgacggt atccgagggt gaagccttga ttagccgcta caagatcgta aagagcgaaa 4140 
ccgggcggcc ggagtacatc gagatcgagc tagctgattg gatgtaccgc gagatcacag 4200
aaggcaagaa cccggacgtg ctgacggttc accccgatta ctttttgatc gatcccggca 4260 
tcggccgttt tctctaccgc ctggcacgcc gcgccgcagg caaggcagaa gccagatggt 4320 
tgttcaagac gatctacgaa cgcagtggca gcgccggaga gttcaagaag ttctgtttca 4380
ccgtgcgcaa gctgatcggg tcaaatgacc tgccggagta cgatttgaag gaggaggcgg 4440 
ggcaggctgg cccgatccta gtcatgcgct accgcaacct gatcgagggc gaagcatccg 4500 
ccggttccta atgtacggag cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc 4560 
gaaaaggtct ctttcctgtg gatagcacgt acattgggaa cccaaagccg tacattggga 4620 
accggaaccc gtacattggg aacccaaagc cgtacattgg gaaccggtca cacatgtaag 4680 
tgactgatat aaaagagaaa aaaggcgatt tttccgccta aaactcttta aaacttatta 4740 
aaactcttaa aacccgcctg gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc 4800 
tgcaaaaagc gcctaccctt cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc 4860
ctatcgcggc cgctggccgc tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc 4920 
gcggacaagc cgcgccgtcg ccactcgacc gccggcgccc acatcaaggc accctgcctc 4980 
gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca 5040 
gcttgtctgt aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt 5100 
ggcgggtgtc ggggcgcagc catgacccag tcacgtagcg atagcggagt gtatactggc 5160 
ttaactatgc ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac 5220 
cgcacagatg cgtaaggaga aaataccgca tcaggcgctc ttccgcttcc tcgctcactg 5280 
actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa 5340 
tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc 5400 
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc 5460 
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat 5520 
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 5580 
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct 5640 
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 5700 
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 5760 
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 5820 
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 5880 
ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta 5940 
gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc 6000 
agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg 6060 
acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgcattct aggtactaaa 6120 
acaattcatc cagtaaaata taatatttta ttttctccca atcaggcttg atccccagta 6180 
agtcaaaaaa tagctcgaca tactgttctt ccccgatatc ctccctgatc gaccggacgc 6240 
agaaggcaat gtcataccac ttgtccgccc tgccgcttct cccaagatca ataaagccac 6300
ttactttgcc atctttcaca aagatgttgc tgtctcccag gtcgccgtgg gaaaagacaa 6360
gttcctcttc gggcttttcc gtctttaaaa aatcatacag ctcgcgcgga tctttaaatg 6420 
gagtgtcttc ttcccagttt tcgcaatcca catcggccag atcgttattc agtaagtaat 6480 
ccaattcggc taagcggctg tctaagctat tcgtataggg acaatccgat atgtcgatgg 6540 
agtgaaagag cctgatgcac tccgcataca gctcgataat cttttcaggg ctttgttcat 6600 
cttcatactc ttccgagcaa aggacgccat cggcctcact catgagcaga ttgctccagc 6660 
catcatgccg ttcaaagtgc aggacctttg gaacaggcag ctttccttcc agccatagca 6720 
tcatgtcctt ttcccgttcc acatcatagg tggtcccttt ataccggctg tccgtcattt 6780 
ttaaatatag gttttcattt tctcccacca gcttatatac cttagcagga gacattcctt 6840 
ccgtatcttt tacgcagcgg tatttttcga tcagtttttt caattccggt gatattctca 6900 
ttttagccat ttattatttc cttcctcttt tctacagtat ttaaagatac cccaagaagc 6960 
taattataac aagacgaact ccaattcact gttccttgca ttctaaaacc ttaaatacca 7020 
gaaaacagct ttttcaaagt tgttttcaaa gttggcgtat aacatagtat cgacggagcc 7080 
gattttgaaa ccgcggtgat cacaggcagc aacgctctgt catcgttaca atcaacatgc 7140 
taccctccgc gagatcatcc gtgtttcaaa cccggcagct tagttgccgt tcttccgaat 7200
agcatcggta acatgagcaa agtctgccgc cttacaacgg ctctcccgct gacgccgtcc 7260 
cggactgatg ggctgcctgt atcgagtggt gattttgtgc cgagctgccg gtcggggagc 7320 
tgttggctgg ctggtggcag gatatattgt ggtgtaaaca aattgacgct tagacaactt 7380
aataacacat tgcggacgtt tttaatgtac tgaattaacg ccgaattaat tcgggggatc 7440 
tggattttag tactggattt tggttttagg aattagaaat tttattgata gaagtatttt 7500 
acaaatacaa atacatacta agggtttctt atatgctcaa cacatgagcg aaaccctata 7560 
ggaaccctaa ttcccttatc tgggaactac tcacacatta ttatggagaa actcgagctt 7620
gtcgatcgac agatccggtc ggcatctact ctatttcttt gccctcggac gagtgctggg 7680
gcgtcggttt ccactatcgg cgagtacttc tacacagcca tcggtccaga cggccgcgct 7740
tctgcgggcg atttgtgtac gcccgacagt cccggctccg gatcggacga ttgcgtcgca 7800
tcgaccctgc gcccaagctg catcatcgaa attgccgtca accaagctct gatagagttg 7860
gtcaagacca atgcggagca tatacgcccg gagtcgtggc gatcctgcaa gctccggatg 7920
cctccgctcg aagtagcgcg tctgctgctc catacaagcc aaccacggcc tccagaagaa 7980
gatgttggcg acctcgtatt gggaatcccc gaacatcgcc tcgctccagt caatgaccgc 8040
tgttatgcgg ccattgtccg tcaggacatt gttggagccg aaatccgcgt gcacgaggtg 8100
ccggacttcg gggcagtcct cggcccaaag catcagctca tcgagagcct gcgcgacgga 8160
cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac acatggggat cagcaatcgc 8220
gcatatgaaa tcacgccatg tagtgtattg accgattcct tgcggtccga atgggccgaa 8280
cccgctcgtc tggctaagat cggccgcagc gatcgcatcc atagcctccg cgaccggttg 8340
tagaacagcg ggcagttcgg tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg 8400
ggagatgcaa taggtcaggc tctcgctaaa ctccccaatg tcaagcactt ccggaatcgg 8460
gagcgcggcc gatgcaaagt gccgataaac ataacgatct ttgtagaaac catcggcgca 8520
gctatttacc cgcaggacat atccacgccc tcctacatcg aagctgaaag cacgagattc 8580
ttcgccctcc gagagctgca tcaggtcgga gacgctgtcg aacttttcga tcagaaactt 8640
ctcgacagac gtcgcggtga gttcaggctt tttcatatct cattgccccc ccggatctgc 8700
gaaagctcga gagagataga tttgtagaga gagactggtg atttcagcgt gtcctctcca 8760
aatgaaatga acttccttat atagaggaag gtcttgcgaa ggatagtggg attgtgcgtc 8820
atcccttacg tcagtggaga tatcacatca atccacttgc tttgaagacg tggttggaac 8880
gtcttctttt tccacgatgc tcctcgtggg tgggggtcca tctttgggac cactgtcggc 8940
agaggcatct tgaacgatag cctttccttt atcgcaatga tggcatttgt aggtgccacc 9000
ttccttttct actgtccttt tgatgaagtg acagatagct gggcaatgga atccgaggag 9060
gtttcccgat attacccttt gttgaaaagt ctcaatagcc ctttggtctt ctgagactgt 9120
atctttgata ttcttggagt agacgagagt gtcgtgctcc accatgttat cacatcaatc 9180
cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc tcgtgggtgg 9240
gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct ttcctttatc 9300
gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga tgaagtgaca 9360
gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt gaaaagtctc 9420
aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga cgagagtgtc 9480
gtgctccacc atgttggcaa gctgctctag ccaatacgca aaccgcctct ccccgcgcgt 9540
tggccgattc attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag 9600
cgcaacgcaa ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg 9660
cttccggctc gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc 9720
tatgaccatg attacgaatt cgagctcggt acccggggat cctctagact gaaggcggga 9780
aacgacaatc tgatcatgag cggagaatta agggagtcac gttatgaccc ccgccgatga 9840
cgcgggacaa gccgttttac gtttggaact gacagaaccg caacgttgaa ggagccactc 9900
agccgcgggt ttctggagtt taatgagcta agcacatacg tcagaaacca ttattgcgcg 9960
ttcaaaagtc gcctaaggtc actatcagct agcaaatatt tcttgtcaaa aatgctccac 10020
tgacgttcca taaattcccc tcggtatcca attagagtct catattcact ctcaatccaa 10080
ataatctgca ccggatctcg agaatcgaat tcccgcggcc gc 10122

<210> 9 
<211> 621 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> N. tabacum rDNA intergenic spacer (IGS) sequence
<300> 

<308> Genbank #Y08422 
<309> 1997-10-31 



<400> 9 

gtgctagcca atgtttaaca agatgtcaag cacaatgaat gttggtggtt ggtggtcgtg 60
gctggcggtg gtggaaaatt gcggtggttc gagcggtagt gatcggcgat ggttggtgtt 120
tgcagcggtg tttgatatcg gaatcactta tggtggttgt cacaatggag gtgcgtcatg 180
gttattggtg gttggtcatc tatatatttt tataataata ttaagtattt tacctatttt 240
ttacatattt tttattaaat ttatgcattg tttgtatttt taaatagttt ttatcgtact 300
tgttttataa aatattttat tattttatgt gttatattat tacttgatgt attggaaatt 360
ttctccattg ttttttctat atttataata attttcttat ttttttttgt tttattatgt 420
attttttcgt tttataataa atatttatta aaaaaaatat tatttttgta aaatatatca 480
tttacaatgt ttaaaagtca tttgtgaata tattagctaa gttgtacttc tttttgtgca 540
tttggtgttg tacatgtcta ttatgattct ctggccaaaa catgtctact cctgtcactt 600
gggttttttt ttttaagaca t 621

<210> 10 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer NTIGS-F1 
<400> 10 

gtgctagcca atgtttaaca agatg 25 

<210> 11 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer NTIGS-R1
<400> 11 

atgtcttaaa aaaaaaaacc caagtgac 28 

<210> 12 

<211> 233 

<212> DNA 

<213> Mus musculus

<300> 

<308> Genbank #V00846 
<309> 1989-07-06 

<400> 12 

gacctggaat atggcgagaa aactgaaaat cacggaaaat gagaaataca cactttagga 60 

cgtgaaatat ggcgaggaaa actgaaaaag gtggaaaatt tagaaatgtc cactgtagga 120 

cgtggaatat ggcaagaaaa ctgaaaatca tggaaaatga gaaacatcca cttgacgact 180 

tgaaaaatga cgaaatcact aaaaaacgtg aaaaatgaga aatgcacact gaa 233 

<210> 13 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer MSAT-P1 
<400> 13 

aataccgcgg aagcttgacc tggaatatcg c 31 

<210> 14 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer MSAT-R1
<400> 14 

ataaccgcgg agtccttcag tgtgcat 27 

<210> 15 
<211> 277 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Nopaline Synthase Promoter Fragment
<300> 

<308> Genbank #U09365
<309> 1997-10-17 

<400> 15

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60 
tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120 
aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180 
attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240 
gcgcgcggtg tcatctatgt tactagatcg ggaattc 277 

<210> 16 
<211> 1812 
<212> DNA 

<213> Escherichia coli 

<220> 
<221> CDS 

<222> (1)...(1812)

<223> Beta-glucuronidase 

<300> 

<308> Genbank #S69414 
<309> 1994-09-23 

<400> 16 

atg tta cgt cct gta gaa acc cca acc cgt gaa atc aaa aaa ctc gac 48
Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp
1 5 10 15

ggc ctg tgg gca ttc agt ctg gat cgc gaa aac tgt gga att gat cag 96
Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln
20 25 30

cgt tgg tgg gaa agc gcg tta caa gaa agc cgg gca att gct gtg cca 144
Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro
35 40 45

ggc agt ttt aac gat cag ttc gcc gat gca gat att cgt aat tat gcg 192
Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60

ggc aac gtc tgg tat cag cgc gaa gtc ttt ata ccg aaa ggt tgg gca 240
Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala
65 70 75 80

ggc cag cgt atc gtg ctg cgt ttc gat gcg gtc act cat tac ggc aaa 288
Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95

gtg tgg gtc aat aat cag gaa gtg atg gag cat cag ggc ggc tat acg 336
Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110

cca ttt gaa gcc gat gtc acg ccg tat gtt att gcc ggg aaa agt gta 384
Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val
115 120 125

cgt atc acc gtt tgt gtg aac aac gaa ctg aac tgg cag act atc ccg 432
Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro
130 135 140

ccg gga atg gtg att acc gac gaa aac ggc aag aaa aag cag tct tac 480
Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr
145 150 155 160

ttc cat gat ttc ttt aac tat gcc gga atc cat cgc agc gta atg ctc 528
Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175

tac acc acg ccg aac acc tgg gtg gac gat atc acc gtg gtg acg cat 576
Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190

gtc gcg caa gac tgt aac cac gcg tct gtt gac tgg cag gtg gtg gcc 624
Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
195 200 205

aat ggt gat gtc agc gtt gaa ctg cgt gat gcg gat caa cag gtg gtt 672
Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
210 215 220

gca act gga caa ggc act agc ggg act ttg caa gtg gtg aat ccg cac 720
Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
225 230 235 240

ctc tgg caa ccg ggt gaa ggt tat ctc tat gaa ctg tgc gtc aca gcc 768
Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255

aaa agc cag aca gag tgt gat atc tac ccg ctt cgc gtc ggc atc cgg 816
Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270

tca gtg gca gtg aag ggc gaa cag ttc ctg att aac cac aaa ccg ttc 864
Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
275 280 285

tac ttt act ggc ttt ggt cgt cat gaa gat gcg gac ttg cgt ggc aaa 912
Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
290 295 300

gga ttc gat aac gtg ctg atg gtg cac gac cac gca tta atg gac tgg 960
Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
305 310 315 320

att ggg gcc aac tcc tac cgt acc tcg cat tac cct tac gct gaa gag 1008
Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335

atg ctc gac tgg gca gat gaa cat ggc atc gtg gtg att gat gaa act 1056
Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
340 345 350

gct gct gtc ggc ttt aac ctc tct tta ggc att ggt ttc gaa gcg ggc 1104
Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
355 360 365

aac aag ccg aaa gaa ctg tac agc gaa gag gca gtc aac ggg gaa act 1152
Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
370 375 380

cag caa gcg cac tta cag gcg att aaa gag ctg ata gcg cgt gac aaa 1200
Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
385 390 395 400

aac cac cca agc gtg gtg atg tgg agt att gcc aac gaa ccg gat acc 1248
Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
405 410 415

cgt ccg caa ggt gca cgg gaa tat ttc gcg cca ctg gcg gaa gca acg 1296
Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr
420 425 430

cgt aaa ctc gac ccg acg cgt ccg atc acc tgc gtc aat gta atg ttc 1344
Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
435 440 445

tgc gac gct cac acc gat acc atc agc gat ctc ttt gat gtg ctg tgc 1392
Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys
450 455 460

ctg aac cgt tat tac gga tgg tat gtc caa agc ggc gat ttg gaa acg 1440
Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr
465 470 475 480

gca gag aag gta ctg gaa aaa gaa ctt ctg gcc tgg cag gag aaa ctg 1488
Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495

cat cag ccg att atc atc acc gaa tac ggc gtg gat acg tta gcc ggg 1536
His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly
500 505 510

ctg cac tca atg tac acc gac atg tgg agt gaa gag tat cag tgt gca 1584
Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala
515 520 525

tgg ctg gat atg tat cac cgc gtc ttt gat cgc gtc agc gcc gtc gtc 1632
Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val
530 535 540

ggt gaa cag gta tgg aat ttc gcc gat ttt gcg acc tcg caa ggc ata 1680
Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile
545 550 555 560

ttg cgc gtt ggc ggt aac aag aaa ggg atc ttc act cgc gac cgc aaa 1728
Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575

ccg aag tcg gcg gct ttt ctg ctg caa aaa cgc tgg act ggc atg aac 1776
Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn
580 585 590

ttc ggt gaa aaa ccg cag cag gga ggc aaa caa tga 1812
Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln *
595 600

<210> 17
<211> 603
<212> PRT

<213> Escherichia coli
<300>

<308> Genbank #S69414
<309> 1994-09-23

<400> 17

Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp
1 5 10 15

Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln
20 25 30

Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro
35 40 45

Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60

Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala
65 70 75 80

Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95

Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110

Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val
115 120 125

Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro
130 135 140

Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr
145 150 155 160

Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175

Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190

Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
195 200 205

Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
210 215 220

Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
225 230 235 240

Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255

Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270

Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
275 280 285

Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
290 295 300

Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
305 310 315 320

Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335

Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
340 345 350

Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
355 360 365

Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
370 375 380

Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
385 390 395 400

Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
405 410 415

Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr
420 425 430

Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
435 440 445

Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys
450 455 460

Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr
465 470 475 480

Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495

His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly
500 505 510

Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala
515 520 525

Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val
530 535 540

Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile
545 550 555 560

Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575

Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn
580 585 590

Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln
595 600

<210> 18
<211> 277 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Nopaline Synthase Terminator Sequence
<300> 

<308> Genbank #U09365 
<309> 1995-10-17 

<400> 18 

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60
tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120
aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180
attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240
gcgcgcggtg tcatctatgt tactagatcg ggaattc 277

<210> 19
<211> 3438
<212> DNA

<213> Artificial Sequence
<220>

<223> pLIT38attBZeo Plasmid
<400> 19

tcgaccctct agtcaaggcc ttaagtgagt cgtattacgg actggccgtc gttttacaac 60
gtcgtgactg ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca catccccctt 120
tcgccagctg gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa cagttgcgca 180
gcctgaatgg cgaatggcgc ttcgcttggt aataaagccc gcttcggcgg gctttttttt 240
gttaactacg tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt 300
tttctaaata cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca 360
ataatattga aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt 420
ttttgcggca ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga 480
tgctgaagat cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa 540
gatccttgag agttttcgcc ccgaagaacg ttctccaatg atgagcactt ttaaagttct 600
gctatgtggc gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat 660
acactattct cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga 720
tggcatgaca gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc 780
caacttactt ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat 840
gggggatcat gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa 900
cgacgagcgt gacaccacga tgcctgtagc aatggcaaca acgttgcgca aactattaac 960
tggcgaacta cttactctag cttcccggca acaattaata gactggatgg aggcggataa 1020
agttgcagga ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc 1080
tggagccggt gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc 1140
ctcccgtatc gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag 1200
acagatcgct gagataggtg cctcactgat taagcattgg taactgtcag accaagttta 1260
ctcatatata ctttagattg atttaccccg gttgataatc agaaaagccc caaaaacagg 1320
aagattgtat aagcaaatat ttaaattgta aacgttaata ttttgttaaa attcgcgtta 1380
aatttttgtt aaatcagctc attttttaac caataggccg aaatcggcaa aatcccttat 1440
aaatcaaaag aatagcccga gatagggttg agtgttgttc cagtttggaa caagagtcca 1500
ctattaaaga acgtggactc caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc 1560
ccactacgtg aaccatcacc caaatcaagt tttttggggt cgaggtgccg taaagcacta 1620
aatcggaacc ctaaagggag cccccgattt agagcttgac ggggaaagcg aacgtggcga 1680
gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt gtagcggtca 1740
cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc gcgtaaaagg 1800
atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg 1860
ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt 1920
ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg 1980
ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata 2040
ccaaatactg ttcttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca 2100
ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag 2160
tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc 2220
tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga 2280
tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg 2340
tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac 2400
gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg 2460 
tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg 2520 
ttcctggcct tttgctggcc ttttgctcac atgtaatgtg agttagctca ctcattaggc 2580 
accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata 2640 
acaatttcac acaggaaaca gctatgacca tgattacgcc aagctacgta atacgactca 2700 
ctagtggggc ccgtgcaatt gaagccggct ggcgccaagc ttctctgcag gattgaagcc 2760 
tgctttttta tactaacttg agcgaaatct ggatccatgg ccaagttgac cagtgccgtt 2820 
ccggtgctca ccgcgcgcga cgtcgccgga gcggtcgagt tctggaccga ccggctcggg 2880
ttctcccggg acttcgtgga ggacgacttc gccggtgtgg tccgggacga cgtgaccctg 2940 
ttcatcagcg cggtccagga ccaggtggtg ccggacaaca ccctggcctg ggtgtgggtg 3000 
cgcggcctgg acgagctgta cgccgagtgg tcggaggtcg tgtccacgaa cttccgggac 3060 
gcctccgggc cggccatgac cgagatcggc gagcagccgt gggggcggga gttcgccctg 3120 
cgcgacccgg ccggcaactg cgtgcacttc gtggccgagg agcaggactg acacgtgcta 3180 
cgagatttcg attccaccgc cgccttctat gaaaggttgg gcttcggaat cgttttccgg 3240 
gacgccggct ggatgatcct ccagcgcggg gatctcatgc tggagttctt cgcccacccc 3300 
aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 3360 
aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 3420 
tatcatgtct gtataccg 3438 

<210> 20 
<211> 3451 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> HindIII Fragment containing the beta-glucuronidase 
coding sequence, the rDNA intergenic spacer, and 
the Mastl sequence 

<400> 20 

aagcttgacc tggaatatcg cgagtaaact gaaaatcacg gaaaatgaga aatacacact 60 
ttaggacgtg aaatatggcg aggaaaactg aaaaaggtgg aaaatttaga aatgtccact 120 
gtaggacgtg gaatatggca agaaaactga aaatcatgga aaatgagaaa catccacttg 180 
acgacttgaa aaatgacgaa atcactaaaa aacgtgaaaa atgagaaatg cacactgaag 240 
gactccgcgg gaattcgatt gtgctagcca atgtttaaca agatgtcaag cacaatgaat 300 
gttggtggtt ggtggtcgtg gctggcggtg gtggaaaatt gcggtggttc gagcggtagt 360 
gatcggcgat ggttggtgtt tgcagcggtg tttgatatcg gaatcactta tggtggttgt 420 
cacaatggag gtgcgtcatg gttattggtg gttggtcatc tatatatttt tataataata 480 
ttaagtattt tacctatttt ttacatattt tttattaaat ttatgcattg tttgtatttt 540 
taaatagttt ttatcgtact tgttttataa aatattttat tattttatgt gttatattat 600 
tacttgatgt attggaaatt ttctccattg ttttttctat atttataata attttcttat 660 
ttttttttgt tttattatgt attttttcgt tttataataa atatttatta aaaaaaatat 720 
tatttttgta aaatatatca tttacaatgt ttaaaagtca tttgtgaata tattagctaa 780 
gttgtacttc tttttgtgca tttggtgttg tacatgtcta ttatgattct ctggccaaaa 840 
catgtctact cctgtcactt gggttttttt ttttaagaca taatcactag tgattatatc 900 
tagactgaag gcgggaaacg acaatctgat catgagcgga gaattaaggg agtcacgtta 960 
tgacccccgc cgatgacgcg ggacaagccg ttttacgttt ggaactgaca gaaccgcaac 1020 
gttgaaggag ccactcagcc gcgggtttct ggagtttaat gagctaagca catacgtcag 1080 
aaaccattat tgcgcgttca aaagtcgcct aaggtcacta tcagctagca aatatttctt 1140 
gtcaaaaatg ctccactgac gttccataaa ttcccctcgg tatccaatta gagtctcata 1200 
ttcactctca atccaaataa tctgcaccgg atctcgagat cgaattcccg cggccgcgaa 1260 
ttcactagtg gatccccggg tacggtcagt cccttatgtt acgtcctgta gaaaccccaa 1320 
cccgtgaaat caaaaaactc gacggcctgt gggcattcag tctggatcgc gaaaactgtg 1380 
gaattgagca gcgttggtgg gaaagcgcgt tacaagaaag ccgggcaatt gctgtgccag 1440 
gcagttttaa cgatcagttc gccgatgcag atattcgtaa ttatgtgggc aacgtctggt 1500 
atcagcgcga agtctttata ccgaaaggtt gggcaggcca gcgtatcgtg ctgcgtttcg 1560 
atgcggtcac tcattacggc aaagtgtggg tcaataatca ggaagtgatg gagcatcagg 1620 
gcggctatac gccatttgaa gccgatgtca cgccgtatgt tattgccggg aaaagtgtac 1680 
gtatcacagt ttgtgtgaac aacgaactga actggcagac tatcccgccg ggaatggtga 1740 
ttaccgacga aaacggcaag aaaaagcagt cttacttcca tgatttcttt aactacgccg 1800 
ggatccatcg cagcgtaatg ctctacacca cgccgaacac ctgggtggac gatatcaccg 1860 
tggtgacgca tgtcgcgcaa gactgtaacc acgcgtctgt tgactggcag gtggtggcca 1920 
atggtgatgt cagcgttgaa ctgcgtgatg cggatcaaca ggtggttgca actggacaag 1980 
gcaccagcgg gactttgcaa gtggtgaatc cgcacctctg gcaaccgggt gaaggttatc 2040 
tctatgaact gtacgtcaca gccaaaagcc agacagagtg tgatatctac ccgctgcgcg 2100 
tcggcatccg gtcagtggca gtgaagggcg aacagttcct gatcaaccac aaaccgttct 2160 
actttactgg ctttggccgt catgaagatg cggatttgcg cggcaaagga ttcgataacg 2220 
tgctgatggt gcacgatcac gcattaatgg actggattgg ggccaactcc taccgtacct 2280 
cgcattaccc ttacgctgaa gagatgctcg actgggcaga tgaacatggc atcgtggtga 2340 
ttgatgaaac tgcagctgtc ggctttaacc tctctttagg cattggtttc gaagcgggca 2400 
acaagccgaa agaactgtac agcgaagagg cagtcaacgg ggaaactcag caggcgcact 2460 
tacaggcgat taaagagctg atagcgcgtg acaaaaacca cccaagcgtg gtgatgtgga 2520 
gtattgccaa cgaaccggat acccgtccgc aaggtgcacg ggaatatttc gcgccactgg 2580 
cggaagcaac gcgtaaactc gatccgacgc gtccgatcac ctgcgtcaat gtaatgttct 2640 
gcgacgctca caccgatacc atcagcgatc tctttgatgt gctgtgcctg aaccgttatt 2700 
acggttggta tgtccaaagc ggcgatttgg aaacggcaga gaaggtactg gaaaaagaac 2760 
ttctggcctg gcaggagaaa ctgcatcagc cgattatcat caccgaatac ggcgtggata 2820 
cgttagccgg gctgcactca atgtacaccg acatgtggag tgaagagtat cagtgtgcat 2880 
ggctggatat gtatcaccgc gtctttgatc gcgtcagcgc cgtcgtcggt gaacaggtat 2940 
ggaatttcgc cgattttgcg acctcgcaag gcatattgcg cgttggcggt aacaagaagg 3000 
ggatcttcac ccgcgaccgc aaaccgaagt cggcggcttt tctgctgcaa aaacgctgga 3060 
ctggcatgaa cttcggtgaa aaaccgcagc agggaggcaa acaatgaatc aacaactctc 3120 
ctggcgcacc atcgtcggct acagcctcgg gaattgcgta ccgagctcga atttccccga 3180 
tcgttcaaac atttggcaat aaagtttctt aagattgaat cctgttgccg gtcttgcgat 3240 
gattatcata taatttctgt tgaattacgt taagcatgta ataattaaca tgtaatgcat 3300 
gacgttattt atgagatggg tttttatgat tagagtcccg caattataca tttaatacgc 3360 
gatagaaaac aaaatatagc gcgcaaacta ggataaatta tcgcgcgcgg tgtcatctat 3420 
gttactagat cgggaattcg atatcaagct t 3451 

<210> 21 
<211> 14627 
<212> DNA 
<213> Artificial Sequence 

<220> 

<223> pAglla Plasmid 
<400> 21 

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60 
atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120 
agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180 
gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240 
agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300 
ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360 
ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420 
acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480 
ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540 
acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600 
agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660 
tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720 
tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780 
ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840 
gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900 
gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960 
cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020 
ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080 
gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140 
tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200 
aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260 
aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320 
ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380 
ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440 
cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500 
atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560 
accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620 
gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680 
gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740 
ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800 
cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860 
aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920 
gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980 
agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040 
ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100 
atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160 
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220 
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280 
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340 
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400 
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460 
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520 
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580 
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640 
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700 
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760 
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820 
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880 
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940 
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000 
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060 
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120 
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180 
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240 
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300 
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360 
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420 
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480 
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540 
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600 
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660 
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720 
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780 
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840 
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900 
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960 
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020 
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080 
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140 
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200 
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260 
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320 
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380 
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440 
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500 
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560 
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620 
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680 
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740 
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800 
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860 
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920 
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980 
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040 
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100 
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160 
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220 
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280 
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340 
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400 
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460 
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520 
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580 
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640 
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700 
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760 
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820 
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880 
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940 
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000 
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060 
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120 
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180 
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240 
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300 
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360 
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420 
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480 
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540 
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600 
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660 
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720 
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780 
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840 
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900 
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960 
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020 
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080 
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140 
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200 
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260 
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320 
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380 
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440 
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500 
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560 
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620 
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680 
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740 
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800 
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860 
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920 
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980 
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040 
tacgaattcg agccttgact agagggtcga cggtatacag acatgataag atacattgat 8100 
gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 8160 
gatgctattg ctttatttgt aaccattata agctgcaata aacaagttgg ggtgggcgaa 8220 
gaactccagc atgagatccc cgcgctggag gatcatccag ccggcgtccc ggaaaacgat 8280 
tccgaagccc aacctttcat agaaggcggc ggtggaatcg aaatctcgta gcacgtgtca 8340 
gtcctgctcc tcggccacga agtgcacgca gttgccggcc gggtcgcgca gggcgaactc 8400 
ccgcccccac ggctgctcgc cgatctcggt catggccggc ccggaggcgt cccggaagtt 8460 
cgtggacacg acctccgacc actcggcgta cagctcgtcc aggccgcgca cccacaccca 8520 
ggccagggtg ttgtccggca ccacctggtc ctggaccgcg ctgatgaaca gggtcacgtc 8580 
gtcccggacc acaccggcga agtcgtcctc cacgaagtcc cgggagaacc cgagccggtc 8640 
ggtccagaac tcgaccgctc cggcgacgtc gcgcgcggtg agcaccggaa cggcactggt 8700 
caacttggcc atggatccag atttcgctca agttagtata aaaaagcagg cttcaatcct 8760 
gcaggaattc gatcgacact ctcgtctact ccaagaatat caaagataca gtctcagaag 8820 
accaaagggc tattgagact tttcaacaaa gggtaatatc gggaaacctc ctcggattcc 8880 
attgcccagc tatctgtcac ttcatcaaaa ggacagtaga aaaggaaggt ggcacctaca 8940 
aatgccatca ttgcgataaa ggaaaggcta tcgttcaaga tgcctctgcc gacagtggtc 9000 
ccaaagatgg acccccaccc acgaggagca tcgtggaaaa agaagacgtt ccaaccacgt 9060 
cttcaaagca agtggattga tgtgataaca tggtggagca cgacactctc gtctactcca 9120 
agaatatcaa agatacagtc tcagaagacc aaagggctat tgagactttt caacaaaggg 9180 
taatatcggg aaacctcctc ggattccatt gcccagctat ctgtcacttc atcaaaagga 9240 
cagtagaaaa ggaaggtggc acctacaaat gccatcattg cgataaagga aaggctatcg 9300 
ttcaagatgc ctctgccgac agtggtccca aagatggacc cccacccacg aggagcatcg 9360 
tggaaaaaga agacgttcca accacgtctt caaagcaagt ggattgatgt gatatctcca 9420 
ctgacgtaag ggatgacgca caatcccact atccttcgca agaccttcct ctatataagg 9480 
aagttcattt catttggaga ggacacgctg aaatcaccag tctctctcta caaatctatc 9540 
tctctcgagc tttcgcagat ccgggggggc aatgagatat gaaaaagcct gaactcaccg 9600 
cgacgtctgt cgagaagttt ctgatcgaaa agttcgacag cgtctccgac ctgatgcagc 9660 
tctcggaggg cgaagaatct cgtgctttca gcttcgatgt aggagggcgt ggatatgtcc 9720 
tgcgggtaaa tagctgcgcc gatggtttct acaaagatcg ttatgtttat cggcactttg 9780 
catcggccgc gctcccgatt ccggaagtgc ttgacattgg ggagtttagc gagagcctga 9840 
cctattgcat ctcccgccgt gcacagggtg tcacgttgca agacctgcct gaaaccgaac 9900 
tgcccgctgt tctacaaccg gtcgcggagg ctatggatgc gatcgctgcg gccgatctta 9960 
gccagacgag cgggttcggc ccattcggac cgcaaggaat cggtcaatac actacatggc 10020 
gtgatttcat atgcgcgatt gctgatcccc atgtgtatca ctggcaaact gtgatggacg 10080 
acaccgtcag tgcgtccgtc gcgcaggctc tcgatgagct gatgctttgg gccgaggact 10140 
gccccgaagt ccggcacctc gtgcacgcgg atttcggctc caacaatgtc ctgacggaca 10200 
atggccgcat aacagcggtc attgactgga gcgaggcgat gttcggggat tcccaatacg 10260 
aggtcgccaa catcttcttc tggaggccgt ggttggcttg tatggagcag cagacgcgct 10320 
acttcgagcg gaggcatccg gagcttgcag gatcgccacg actccgggcg tatatgctcc 10380 
gcattggtct tgaccaactc tatcagagct tggttgacgg caatttcgat gatgcagctt 10440 
gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc gggcgtacac 10500 
aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg tgtagaagta ctcgccgata 10560 
gtggaaaccg acgccccagc actcgtccga gggcaaagaa atagagtaga tgccgaccgg 10620 
atctgtcgat cgacaagctc gagtttctcc ataataatgt gtgagtagtt cccagataag 10680 
ggaattaggg ttcctatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 10740 
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa accaaaatcc 10800 
agtactaaaa tccagatccc ccgaattaat tcggcgttaa ttcagatcaa gcttgacctg 10860 
gaatatcgcg agtaaactga aaatcacgga aaatgagaaa tacacacttt aggacgtgaa 10920 
atatggcgag gaaaactgaa aaaggtggaa aatttagaaa tgtccactgt aggacgtgga 10980 
atatggcaag aaaactgaaa atcatggaaa atgagaaaca tccacttgac gacttgaaaa 11040 
atgacgaaat cactaaaaaa cgtgaaaaat gagaaatgca cactgaagga ctccgcggga 11100 
attcgattgt gctagccaat gtttaacaag atgtcaagca caatgaatgt tggtggttgg 11160 
tggtcgtggc tggcggtggt ggaaaattgc ggtggttcga gcggtagtga tcggcgatgg 11220 
ttggtgtttg cagcggtgtt tgatatcgga atcacttatg gtggttgtca caatggaggt 11280 
gcgtcatggt tattggtggt tggtcatcta tatattttta taataatatt aagtatttta 11340 
cctatttttt acatattttt tattaaattt atgcattgtt tgtattttta aatagttttt 11400 
atcgtacttg ttttataaaa tattttatta ttttatgtgt tatattatta cttgatgtat 11460 
tggaaatttt ctccattgtt ttttctatat ttataataat tttcttattt ttttttgttt 11520 
tattatgtat tttttcgttt tataataaat atttattaaa aaaaatatta tttttgtaaa 11580 
atatatcatt tacaatgttt aaaagtcatt tgtgaatata ttagctaagt tgtacttctt 11640 
tttgtgcatt tggtgttgta catgtctatt atgattctct ggccaaaaca tgtctactcc 11700 
tgtcacttgg gttttttttt ttaagacata atcactagtg attatatcta gactgaaggc 11760 
gggaaacgac aatctgatca tgagcggaga attaagggag tcacgttatg acccccgccg 11820 
atgacgcggg acaagccgtt ttacgtttgg aactgacaga accgcaacgt tgaaggagcc 11880 
actcagccgc gggtttctgg agtttaatga gctaagcaca tacgtcagaa accattattg 11940 
cgcgttcaaa agtcgcctaa ggtcactatc agctagcaaa tatttcttgt caaaaatgct 12000 
ccactgacgt tccataaatt cccctcggta tccaattaga gtctcatatt cactctcaat 12060 
ccaaataatc tgcaccggat ctcgagatcg aattcccgcg gccgcgaatt cactagtgga 12120 
tccccgggta cggtcagtcc cttatgttac gtcctgtaga aaccccaacc cgtgaaatca 12180 
aaaaactcga cggcctgtgg gcattcagtc tggatcgcga aaactgtgga attgagcagc 12240 
gttggtggga aagcgcgtta caagaaagcc gggcaattgc tgtgccaggc agttttaacg 12300 
atcagttcgc cgatgcagat attcgtaatt atgtgggcaa cgtctggtat cagcgcgaag 12360 
tctttatacc gaaaggttgg gcaggccagc gtatcgtgct gcgtttcgat gcggtcactc 12420 
attacggcaa agtgtgggtc aataatcagg aagtgatgga gcatcagggc ggctatacgc 12480 
catttgaagc cgatgtcacg ccgtatgtta ttgccgggaa aagtgtacgt atcacagttt 12540 
gtgtgaacaa cgaactgaac tggcagacta tcccgccggg aatggtgatt accgacgaaa 12600 
acggcaagaa aaagcagtct tacttccatg atttctttaa ctacgccggg atccatcgca 12660 
gcgtaatgct ctacaccacg ccgaacacct gggtggacga tatcaccgtg gtgacgcatg 12720 
tcgcgcaaga ctgtaaccac gcgtctgttg actggcaggt ggtggccaat ggtgatgtca 12780 
gcgttgaact gcgtgatgcg gatcaacagg tggttgcaac tggacaaggc accagcggga 12840 
ctttgcaagt ggtgaatccg cacctctggc aaccgggtga aggttatctc tatgaactgt 12900 
acgtcacagc caaaagccag acagagtgtg atatctaccc gctgcgcgtc ggcatccggt 12960 
cagtggcagt gaagggcgaa cagttcctga tcaaccacaa accgttctac tttactggct 13020 
ttggccgtca tgaagatgcg gatttgcgcg gcaaaggatt cgataacgtg ctgatggtgc 13080 
acgatcacgc attaatggac tggattgggg ccaactccta ccgtacctcg cattaccctt 13140 
acgctgaaga gatgctcgac tgggcagatg aacatggcat cgtggtgatt gatgaaactg 13200 
cagctgtcgg ctttaacctc tctttaggca ttggtttcga agcgggcaac aagccgaaag 13260 
aactgtacag cgaagaggca gtcaacgggg aaactcagca ggcgcactta caggcgatta 13320 
aagagctgat agcgcgtgac aaaaaccacc caagcgtggt gatgtggagt attgccaacg 13380 
aaccggatac ccgtccgcaa ggtgcacggg aatatttcgc gccactggcg gaagcaacgc 13440 
gtaaactcga tccgacgcgt ccgatcacct gcgtcaatgt aatgttctgc gacgctcaca 13500 
ccgataccat cagcgatctc tttgatgtgc tgtgcctgaa ccgttattac ggttggtatg 13560 
tccaaagcgg cgatttggaa acggcagaga aggtactgga aaaagaactt ctggcctggc 13620 
aggagaaact gcatcagccg attatcatca ccgaatacgg cgtggatacg ttagccgggc 13680 
tgcactcaat gtacaccgac atgtggagtg aagagtatca gtgtgcatgg ctggatatgt 13740 
atcaccgcgt ctttgatcgc gtcagcgccg tcgtcggtga acaggtatgg aatttcgccg 13800 
attttgcgac ctcgcaaggc atattgcgcg ttggcggtaa caagaagggg atcttcaccc 13860 
gcgaccgcaa accgaagtcg gcggcttttc tgctgcaaaa acgctggact ggcatgaact 13920 
tcggtgaaaa accgcagcag ggaggcaaac aatgaatcaa caactctcct ggcgcaccat 13980 
cgtcggctac agcctcggga attgcgtacc gagctcgaat ttccccgatc gttcaaacat 14040 
ttggcaataa agtttcttaa gattgaatcc tgttgccggt cttgcgatga ttatcatata 14100 
atttctgttg aattacgtta agcatgtaat aattaacatg taatgcatga cgttatttat 14160 
gagatgggtt tttatgatta gagtcccgca attatacatt taatacgcga tagaaaacaa 14220 
aatatagcgc gcaaactagg ataaattatc gcgcgcggtg tcatctatgt tactagatcg 14280 
ggaattcgat atcaagcttg gcactggccg tcgttttaca acgtcgtgac tgggaaaacc 14340 
ctggcgttac ccaacttaat cgccttgcag cacatccccc tttcgccagc tggcgtaata 14400 
gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatgct 14460 
agagcagctt gagcttggat cagattgtcg tttcccgcct tcagtttaaa ctatcagtgt 14520 
ttgacaggat atattggcgg gtaaacctaa gagaaaagag cgtttattag aataacggat 14580 
atttaaaagg gcgtgaaaag gtttatccgt tcgtccattt gtatgtg 14627 

<210> 22 
<211> 4257 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pPUR Plasmid 
<400> 22 

ctgtggaatg tgtgtcagtt agggtgtgga aagtccccag gctccccagc aggcagaagt 60 
atgcaaagca tgcatctcaa ttagtcagca accaggtgtg gaaagtcccc aggctcccca 120 
gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccatagt cccgccccta 180 
actccgccca tcccgcccct aactccgccc agttccgccc attctccgcc ccatggctga 240 
ctaatttttt ttatttatgc agaggccgag gccgcctcgg cctctgagct attccagaag 300 
tagtgaggag gcttttttgg aggcctaggc ttttgcaaaa agcttgcatg cctgcaggtc 360 
ggccgccacg accggtgccg ccaccatccc ctgacccacg cccctgaccc ctcacaagga 420 
gacgaccttc catgaccgag tacaagccca cggtgcgcct cgccacccgc gacgacgtcc 480 
cccgggccgt acgcaccctc gccgccgcgt tcgccgacta ccccgccacg cgccacaccg 540 
tcgacccgga ccgccacatc gagcgggtca ccgagctgca agaactcttc ctcacgcgcg 600 
tcgggctcga catcggcaag gtgtgggtcg cggacgacgg cgccgcggtg gcggtctgga 660 
ccacgccgga gagcgtcgaa gcgggggcgg tgttcgccga gatcggcccg cgcatggccg 720 
agttgagcgg ttcccggctg gccgcgcagc aacagatgga aggcctcctg gcgccgcacc 780 
ggcccaagga gcccgcgtgg ttcctggcca ccgtcggcgt ctcgcccgac caccagggca 840 
agggtctggg cagcgccgtc gtgctccccg gagtggaggc ggccgagcgc gccggggtgc 900 
ccgccttcct ggagacctcc gcgccccgca acctcccctt ctacgagcgg ctcggcttca 960 
ccgtcaccgc cgacgtcgag gtgcccgaag gaccgcgcac ctggtgcatg acccgcaagc 1020 
ccggtgcctg acgcccgccc cacgacccgc agcgcccgac cgaaaggagc gcacgacccc 1080 
atggctccga ccgaagccga cccgggcggc cccgccgacc ccgcacccgc ccccgaggcc 1140 
caccgactct agaggatcat aatcagccat accacatttg tagaggtttt acttgcttta 1200 
aaaaacctcc cacacctccc cctgaacctg aaacataaaa tgaatgcaat tgttgttgtt 1260 
aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 1320 
aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 1380 
tatcatgtct ggatccccag gaagctcctc tgtgtcctca taaaccctaa cctcctctac 1440 
ttgagaggac attccaatca taggctgccc atccaccctc tgtgtcctcc tgttaattag 1500 
gtcacttaac aaaaaggaaa ttgggtaggg gtttttcaca gaccgctttc taagggtaat 1560 
tttaaaatat ctgggaagtc ccttccactg ctgtgttcca gaagtgttgg taaacagccc 1620 
acaaatgtca acagcagaaa catacaagct gtcagctttg cacaagggcc caacaccctg 1680 
ctcatcaaga agcactgtgg ttgctgtgtt agtaatgtgc aaaacaggag gcacattttc 1740 
cccacctgtg taggttccaa aatatctagt gttttcattt ttacttggat caggaaccca 1800 
gcactccact ggataagcat tatccttatc caaaacagcc ttgtggtcag tgttcatctg 1860 
ctgactgtca actgtagcat tttttggggt tacagtttga gcaggatatt tggtcctgta 1920 
gtttgctaac acaccctgca gctccaaagg ttccccacca acagcaaaaa aatgaaaatt 1980 
tgacccttga atgggttttc cagcaccatt ttcatgagtt ttttgtgtcc ctgaatgcaa 2040 
gtttaacata gcagttaccc caataacctc agttttaaca gtaacagctt cccacatcaa 2100 
aatatttcca caggttaagt cctcatttaa attaggcaaa ggaattcttg aagacgaaag 2160 
ggcctcgtga tacgcctatt tttataggtt aatgtcatga taataatggt ttcttagacg 2220 
tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata 2280 
cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga 2340 
aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca 2400 
ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat 2460 
cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag 2520 
agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc 2580 
gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat acactattct 2640 
cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca 2700 
gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt 2760 
ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat 2820 
gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt 2880 
gacaccacga tgcctgcagc aatggcaaca acgttgcgca aactattaac tggcgaacta 2940 
cttactctag cttcccggca acaattaata gactggatgg aggcggataa agttgcagga 3000 
ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc tggagccggt 3060 
gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc ctcccgtatc 3120 
gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag acagatcgct 3180 
gagataggtg cctcactgat taagcattgg taactgtcag accaagttta ctcatatata 3240 
ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt 3300 
gataatctca tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc 3360 
gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg 3420 
caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact 3480 
ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg 3540 
tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg 3600 
ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac 3660 
tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca 3720 
cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga 3780 
gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc 3840 
ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct 3900 
gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg 3960 
agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct 4020 
tttgctcaca tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc 4080 
tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc 4140 
gaggaagcgg aagagcgcct gatgcggtat tttctcctta cgcatctgtg cggtatttca 4200 
caccgcatat ggtgcactct cagtacaatc tgctctgatg ccgcatagtt aagccag 4257 

<210> 23 
<211> 2713 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pNEB193 Plasmid 
<400> 23 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 

accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240 

attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300 

tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360 

tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt cgagctcggt acccgggggc 420 

gcgccggatc cttaattaag tctagagtcg actgtttaaa cctgcaggca tgcaagcttg 480 

gcgtaatcat ggtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac 540 

aacatacgag ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc 600 

acattaattg cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg 660 

cattaatgaa tcggccaacg cgcggggaga ggcggtttgc gtattgggcg ctcttccgct 720 

tcctcgctca ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac 780 

tcaaaggcgg taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga 840 

gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat 900 

aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac 960 

ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1020 

gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1080 

ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1140 

ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1200 

cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1260 

attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1320 

ggctacacta gaaggacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga 1380 

aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt 1440 

gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt 1500 

tctacggggt ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga 1560 

ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc 1620 

taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct 1680 

atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt cgtgtagata 1740 

actacgatac gggagggctt accatctggc cccagtgctg caatgatacc gcgagaccca 1800 

cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc cgagcgcaga 1860 

agtggtcctg caactttatc cgcctccatc cagtctatta attgttgccg ggaagctaga 1920 

gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac aggcatcgtg 1980 

gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga 2040 

gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt 2100 
gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct 2160 
cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca 2220 
ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat acgggataat 2280 
accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga 2340 
aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc 2400 
aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg 2460 
caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc 2520 
ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt 2580 
gaatgtattt agaaaaataa acaaataggg gttccgcgca catttccccg aaaagtgcca 2640 
cctgacgtct aagaaaccat tattatcatg acattaacct ataaaaatag gcgtatcacg 2700 
aggccctttc gtc 2713 

<210> 24 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attPUP Primer 
<400> 24 

ccttgcgcta atgctctgtt acagg 25 

<210> 25 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attPDWN Primer 
<400> 25 

cagaggcagg gagtgggaca aaattg 26 

<210> 26 
<211> 4346 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pSV40193attPsensePUR Plasmid 



<400> 26 

ccggtgccgc caccatcccc tgacccacgc ccctgacccc tcacaaggag acgaccttcc 60 
atgaccgagt acaagcccac ggtgcgcctc gccacccgcg acgacgtccc ccgggccgta 120 
cgcaccctcg ccgccgcgtt cgccgactac cccgccacgc gccacaccgt cgacccggac 180 
cgccacatcg agcgggtcac cgagctgcaa gaactcttcc tcacgcgcgt cgggctcgac 240 
atcggcaagg tgtgggtcgc ggacgacggc gccgcggtgg cggtctggac cacgccggag 300 
agcgtcgaag cgggggcggt gttcgccgag atcggcccgc gcatggccga gttgagcggt 360 
tcccggctgg ccgcgcagca acagatggaa ggcctcctgg cgccgcaccg gcccaaggag 420 
cccgcgtggt tcctggccac cgtcggcgtc tcgcccgacc accagggcaa gggtctgggc 480 
agcgccgtcg tgctccccgg agtggaggcg gccgagcgcg ccggggtgcc cgccttcctg 540 
gagacctccg cgccccgcaa cctccccttc tacgagcggc tcggcttcac cgtcaccgcc 600 
gacgtcgagg tgcccgaagg accgcgcacc tggtgcatga cccgcaagcc cggtgcctga 660 
cgcccgcccc acgacccgca gcgcccgacc gaaaggagcg cacgacccca tggctccgac 720 
cgaagccgac ccgggcggcc ccgccgaccc cgcacccgcc cccgaggccc accgactcta 780 
gaggatcata atcagccata ccacatttgt agaggtttta cttgctttaa aaaacctccc 840 
acacctcccc ctgaacctga aacataaaat gaatgcaatt gttgttgtta acttgtttat 900 
tgcagcttat aatggttaca aataaagcaa tagcatcaca aatttcacaa ataaagcatt 960 
tttttcactg cattctagtt gtggtttgtc caaactcatc aatgtatctt atcatgtctg 1020 
gatccgcgcc ggatccttaa ttaagtctag agtcgactgt ttaaacctgc aggcatgcaa 1080 
gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc 1140 
cacacaacat acgagccgga agcataaagt gtaaagcctg gggtgcctaa tgagtgagct 1200 
aactcacatt aattgcgttg cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc 1260 
agctgcatta atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt 1320 
ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 1380 
ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 1440 
tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 1500 
tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 1560 
gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 1620 
ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 1680 
tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 1740 
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 1800 
atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 1860 
acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 1920 
actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct 1980 
tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 2040 
tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 2100 
tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 2160 
tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 2220 
caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 2280 
cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc cccgtcgtgt 2340 
agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag 2400 
acccacgctc accggctcca gatttatcag caataaacca gccagccgga agggccgagc 2460 
gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag 2520 
ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca 2580 
tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa 2640 
ggcgagttac atgatccccc atgttgtgga aaaaagcggt tagctccttc ggtcctccga 2700 
tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata 2760 
attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag tactcaacca 2820 
agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg 2880 
ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa cgttcttcgg 2940 
ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg 3000 
cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag 3060 
gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac 3120 
tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg agcggataca 3180 
tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt ccccgaaaag 3240 
tgccacctga cgtctaagaa accattatta tcatgacatt aacctataaa aataggcgta 3300 
tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc 3360 
agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga caagcccgtc 3420 
agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg gcatcagagc 3480 
agattgtact gagagtgcac catatgcggt gtgaaatacc gcacagatgc gtaaggagaa 3540 
aataccgcat caggcgccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg 3600 
tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg atgtgctgca aggcgattaa 3660 
gttgggtaac gccagggttt tcccagtcac gacgttgtaa aacgacggcc agtgaattcg 3720 
agctgtggaa tgtgtgtcag ttagggtgtg gaaagtcccc aggctcccca gcaggcagaa 3780 
gtatgcaaag catgcatctc aattagtcag caaccaggtg tggaaagtcc ccaggctccc 3840 
cagcaggcag aagtatgcaa agcatgcatc tcaattagtc agcaaccata gtcccgcccc 3900 
taactccgcc catcccgccc ctaactccgc ccagttccgc ccattctccg ccccatggct 3960 
gactaatttt ttttatttat gcagaggccg aggccgcctc ggcctctgag ctattccaga 4020 
agtagtgagg aggctttttt ggaggctcgg tacccccttg cgctaatgct ctgttacagg 4080 
tcactaatac catctaagta gttgattcat agtgactgca tatgttgtgt tttacagtat 4140 
tatgtagtct gttttttatg caaaatctaa tttaatatat tgatatttat atcattttac 4200 
gtttctcgtt cagctttttt atactaagtt ggcattataa aaaagcattg cttatcaatt 4260 
tgttgcaacg aacaggtcac tatcagtcaa aataaaatca ttatttgatt tcaattttgt 4320 
cccactccct gcctctgggg ggcgcg 4346 

<210> 27 
<211> 5855 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCXLamlntR Plasmid 
<400> 27 

gtcgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 60 
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120 
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180 
ggactttcca ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac 240 
atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300 
cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360 
tattagtcat cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc 420 
atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480 
gcgatggggg cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540 
gggcggggcg aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600 
tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660 
gggagtcgct gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720 
ccggctctga ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780 
gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840 
ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900 
tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960 
cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020 
ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080 
tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140 
cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200 
gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260 
ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320 
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380 
gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440 
tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500 
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560 
acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620 
gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680 
acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcatggga agaaggcgaa 1740 
gtcatgagcg ccgggattta ccccctaacc tttatataag aaacaatgga tattactgct 1800 
acagggaccc aaggacgggt aaagagtttg gattaggcag agacaggcga atcgcaatca 1860 
ctgaagctat acaggccaac attgagttat tttcaggaca caaacacaag cctctgacag 1920 
cgagaatcaa cagtgataat tccgttacgt tacattcatg gcttgatcgc tacgaaaaaa 1980 
tcctggccag cagaggaatc aagcagaaga cactcataaa ttacatgagc aaaattaaag 2040 
caataaggag gggtctgcct gatgctccac ttgaagacat caccacaaaa gaaattgcgg 2100 
caatgctcaa tggatacata gacgagggca aggcggcgtc agccaagtta atcagatcaa 2160 
cactgagcga tgcattccga gaggcaatag ctgaaggcca tataacaaca aaccatgtcg 2220 
ctgccactcg cgcagcaaaa tctagagtaa ggagatcaag acttacggct gacgaatacc 2280 
tgaaaattta tcaagcagca gaatcatcac catgttggct cagacttgca atggaactgg 2340 
ctgttgttac cgggcaacga gttggtgatt tatgcgaaat gaagtggtct gatatcgtag 2400 
atggatatct ttatgtcgag caaagcaaaa caggcgtaaa aattgccatc ccaacagcat 2460 
tgcatattga tgctctcgga atatcaatga aggaaacact tgataaatgc aaagagattc 2520 
ttggcggaga aaccataatt gcatctactc gtcgcgaacc gctttcatcc ggcacagtat 2580 
caaggtattt tatgcgcgca cgaaaagcat caggtctttc cttcgaaggg gatccgccta 2640 
cctttcacga gttgcgcagt ttgtctgcaa gactctatga gaagcagata agcgataagt 2700 
ttgctcaaca tcttctcggg cataagtcgg acaccatggc atcacagtat cgtgatgaca 2760 
gaggcaggga gtgggacaaa attgaaatca aataagaatt cactcctcag gtgcaggctg 2820 
cctatcagaa ggtggtggct ggtgtggcca atgccctggc tcacaaatac cactgagatc 2880 
tttttccctc tgccaaaaat tatggggaca tcatgaagcc ccttgagcat ctgacttctg 2940 
gctaataaag gaaatttatt ttcattgcaa tagtgtgttg gaattttttg tgtctctcac 3000 
tcggaaggac atatgggagg gcaaatcatt taaaacatca gaatgagtat ttggtttaga 3060 
gtttggcaac atatgccata tgctggctgc catgaacaaa ggtggctata aagaggtcat 3120 
cagtatatga aacagccccc tgctgtccat tccttattcc atagaaaagc cttgacttga 3180 
ggttagattt tttttatatt ttgttttgtg ttattttttt ctttaacatc cctaaaattt 3240 
tccttacatg ttttactagc cagatttttc ctcctctcct gactactccc agtcatagct 3300 
gtccctcttc tcttatgaag atccctcgac ctgcagccca agcttggcgt aatcatggtc 3360 
atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 3420 
aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 3480 
gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagcggatcc gcatctcaat 3540 
tagtcagcaa ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt 3600 
tccgcccatt ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc 3660 
gcctcggcct ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt 3720 
tgcaaaaagc taacttgttt attgcagctt ataatggtta caaataaagc aatagcatca 3780 
caaatttcac aaataaagca tttttttcac tgcattctag ttgtggtttg tccaaactca 3840 
tcaatgtatc ttatcatgtc tggatccgct gcattaatga atcggccaac gcgcggggag 3900 
aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 3960 
cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 4020 
atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 4080 
taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 4140 
aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 4200 
tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 4260 
gtccgccttt ctcccttcgg gaagcgtggc gctttctcaa tgctcacgct gtaggtatct 4320 
cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 4380 
cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 4440 
atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 4500 
tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat 4560 
ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 4620 
acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 4680 
aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 4740 
aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct 4800 
tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga 4860 
cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc 4920 
catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg 4980 
ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat 5040 
aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat 5100 
ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg 5160 
caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc 5220 
attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa 5280 
agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc 5340 
actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt 5400 
ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag 5460 
ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt 5520 
gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag 5580 
atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac 5640 
cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc 5700 
gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca 5760 
gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg 5820 
ggttccgcgc acatttcccc gaaaagtgcc acctg 5855 

<210> 28 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> 5PacSV40 Primer 
<400> 28 

ctgttaatta actgtggaat gtgtgtcagt tagggtg 37 

<210> 29 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Antisense Zeo Primer 
<400> 29 

tgaacagggt cacgtcgtcc 20 

<210> 30 
<211> 1032 
<212> DNA 

<213> Escherichia Coli 

<220> 

<221> CDS 

<222> (1)...(1032) 

<223> nucleotide sequence encoding Cre recombinase 
<400> 30 

atg tcc aat tta ctg acc gta cac caa aat ttg cct gca tta ccg gtc 48 
Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 
1 5 10 15 

gat gca acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc agg 96 
Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 
20 25 30 

gat cgc cag gcg ttt tct gag cat acc tgg aaa atg ctt ctg tcc gtt 144 
Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 
35 40 45 

tgc cgg tcg tgg gcg gca tgg tgc aag ttg aat aac cgg aaa tgg ttt 192 
Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 
50 55 60 

ccc gca gaa cct gaa gat gtt cgc gat tat ctt cta tat ctt cag gcg 240 
Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 
65 70 75 80 

cgc ggt ctg gca gta aaa act atc cag caa cat ttg ggc cag cta aac 288 
Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 
85 90 95 

atg ctt cat cgt cgg tcc ggg ctg cca cga cca agt gac agc aat gct 336 
Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 
100 105 110 

gtt tca ctg gtt atg cgg cgg atc cga aaa gaa aac gtt gat gcc ggt 384 
Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 
115 120 125 

gaa cgt gca aaa cag gct cta gcg ttc gaa cgc act gat ttc gac cag 432 
Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 
130 135 140 

gtt cgt tca ctc atg gaa aat agc gat cgc tgc cag gat ata cgt aat 480 
Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 
145 150 155 160 

ctg gca ttt ctg ggg att gct tat aac acc ctg tta cgt ata gcc gaa 528 
Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 
165 170 175 

att gcc agg atc agg gtt aaa gat atc tca cgt act gac ggt ggg aga 576 
Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 
180 185 190 

atg tta atc cat att ggc aga acg aaa acg ctg gtt agc acc gca ggt 624 
Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 
195 200 205 

gta gag aag gca ctt agc ctg ggg gta act aaa ctg gtc gag cga tgg 672 
Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 
210 215 220 

att tcc gtc tct ggt gta gct gat gat ccg aat aac tac ctg ttt tgc 720 
Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 
225 230 235 240 

cgg gtc aga aaa aat ggt gtt gcc gcg cca tct gcc acc agc cag cta 768 
Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 
245 250 255 

tca act cgc gcc ctg gaa ggg att ttt gaa gca act cat cga ttg att 816 
Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 
260 265 270 

tac ggc gct aag gat gac tct ggt cag aga tac ctg gcc tgg tct gga 864 
Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 
275 280 285 

cac agt gcc cgt gtc gga gcc gcg cga gat atg gcc cgc gct gga gtt 912 
His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 
290 295 300 

tca ata ccg gag atc atg caa gct ggt ggc tgg acc aat gta aat att 960 
Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 
305 310 315 320 

gtc atg aac tat atc cgt aac ctg gat agt gaa aca ggg gca atg gtg 1008 
Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 
325 330 335 

cgc ctg ctg gaa gat ggc gat tag 1032 
Arg Leu Leu Glu Asp Gly Asp * 
340 

<210> 31 
<211> 343 
<212> PRT 

<213> Escherichia Coli 

<400> 31 

Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 
1 5 10 15 

Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 
20 25 30 

Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 
35 40 45 

Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 
50 55 60 

Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 
65 70 75 80 

Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 
85 90 95 

Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 
100 105 110 

Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 
115 120 125 

Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 
130 135 140 

Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 
145 150 155 160 

Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 
165 170 175 

Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 
180 185 190 

Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 
195 200 205 

Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 
210 215 220 

Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 
225 230 235 240 

Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 
245 250 255 

Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 
260 265 270 

Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 
275 280 285 

His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 
290 295 300 

Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 
305 310 315 320 

Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 
325 330 335 

Arg Leu Leu Glu Asp Gly Asp 
340 

<210> 32 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB1 recognition sequence 
<400> 32 

tgaagcctgc ttttttatac taacttgagc gaa 33 

<210> 33 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-att recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 33 

rkycwgcttt yktrtacnaa stsgb 25 

<210> 34 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attB recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or c or g or t/u 
<400> 34 

agccwgcttt yktrtacnaa ctsgb 25 

<210> 35 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attR recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 35 

gttcagcttt cktrtacnaa ctsgb 25 

<210> 36 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attL recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 36 

agccwgcttt cktrtacnaa gtsgb 25 



<210> 37 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attP1 recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 37 

gttcagcttt yktrtacnaa gtsgb 25 

<210> 38 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB2 recognition sequence 
<400> 38 

agcctgcttt cttgtacaaa cttgt 25 

<210> 39 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB3 recognition sequence 
<400> 39 

acccagcttt cttgtacaaa cttgt 25 

<210> 40 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR1 recognition sequence 
<400> 40 

gttcagcttt tttgtacaaa cttgt 25 

<210> 41 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR2 recognition sequence 
<400> 41 

gttcagcttt cttgtacaaa cttgt 25 

<210> 42 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR3 recognition sequence 
<400> 42 

gttcagcttt cttgtacaaa gttgg 25 

<210> 43 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL1 recognition sequence 
<400> 43 

agcctgcttt tttgtacaaa gttgg 25 

<210> 44 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL2 recognition sequence 
<400> 44 

agcctgcttt cttgtacaaa gttgg 25 

<210> 45 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL3 recognition sequence 
<400> 45 

acccagcttt cttgtacaaa gttgg 25 

<210> 46 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP1 recognition sequence 
<400> 46 

gttcagcttt tttgtacaaa gttgg 25 

<210> 47 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP2,P3 recognition sequence 
<400> 47 

gttcagcttt cttgtacaaa gttgg 25 

<210> 48 
<211> 282 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP recognition sequence 
<400> 48 

ccttgcgcta atgctctgtt acaggtcact aataccatct aagtagttga ttcatagtga 60 

ctgcatatgt tgtgttttac agtattatgt agtctgtttt ttatgcaaaa tctaatttaa 120 

tatattgata tttatatcat tttacgtttc tcgttcagct tttttatact aagttggcat 180 

tataaaaaag cattgcttat caatttgttg caacgaacag gtcactatca gtcaaaataa 240 

aatcattatt tgatttcaat tttgtcccac tccctgcctc tg 282 

<210> 49 
<211> 1071 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> nucleotide sequence encoding Integrase E174R 

<221> CDS 

<222> (1)...(1071) 

<223> Integrase E174R 



<400> 49 

atg gga aga agg cga agt cat gag cgc cgg gat tta ccc cct aac ctt 48 
Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu 
1 5 10 15 

tat ata aga aac aat gga tat tac tgc tac agg gac cca agg acg ggt 96 
Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly 
20 25 30 

aaa gag ttt gga tta ggc aga gac agg cga atc gca atc act gaa gct 144 
Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala 
35 40 45 

ata cag gcc aac att gag tta ttt tca gga cac aaa cac aag cct ctg 192 
Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu 
50 55 60 

aca gcg aga atc aac agt gat aat tcc gtt acg tta cat tca tgg ctt 240 
Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu 
65 70 75 80 

gat cgc tac gaa aaa atc ctg gcc agc aga gga atc aag cag aag aca 288 
Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr 
85 90 95 

ctc ata aat tac atg agc aaa att aaa gca ata agg agg ggt ctg cct 336 
Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro 
100 105 110 

gat gct cca ctt gaa gac atc acc aca aaa gaa att gcg gca atg ctc 384 
Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu 
115 120 125 

aat gga tac ata gac gag ggc aag gcg gcg tca gcc aag tta atc aga 432 
Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg 
130 135 140 

tca aca ctg agc gat gca ttc cga gag gca ata gct gaa ggc cat ata 480 
Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile 
145 150 155 160 

aca aca aac cat gtc gct gcc act cgc gca gca aaa tct aga gta agg 528 
Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg 
165 170 175 

aga tca aga ctt acg gct gac gaa tac ctg aaa att tat caa gca gca 576 
Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala 
180 185 190 

gaa tca tca cca tgt tgg ctc aga ctt gca atg gaa ctg gct gtt gtt 624 
Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val 
195 200 205 

acc ggg caa cga gtt ggt gat tta tgc gaa atg aag tgg tct gat atc 672 
Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile 
210 215 220 

gta gat gga tat ctt tat gtc gag caa agc aaa aca ggc gta aaa att 720 
Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile 
225 230 235 240 

gcc atc cca aca gca ttg cat att gat gct ctc gga ata tca atg aag 768 
Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys 
245 250 255 

gaa aca ctt gat aaa tgc aaa gag att ctt ggc gga gaa acc ata att 816 
Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile 
260 265 270 

gca tct act cgt cgc gaa ccg ctt tca tcc ggc aca gta tca agg tat 864 
Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr 
275 280 285 

ttt atg cgc gca cga aaa gca tca ggt ctt tcc ttc gaa ggg gat ccg 912 
Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro 
290 295 300 

cct acc ttt cac gag ttg cgc agt ttg tct gca aga ctc tat gag aag 960 
Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys 
305 310 315 320 

cag ata agc gat aag ttt gct caa cat ctt ctc ggg cat aag tcg gac 1008 
Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp 
325 330 335 

acc atg gca tca cag tat cgt gat gac aga ggc agg gag tgg gac aaa 1056 
Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys 
340 345 350 

att gaa atc aaa taa 1071 
Ile Glu Ile Lys * 
355 



<210> 50 
<211> 356 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Integrase E174R 
<400> 50 

Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu 
1 5 10 15 

Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly 
20 25 30 

Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala 
35 40 45 

Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu 
50 55 60 

Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu 
65 70 75 80 

Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr 
85 90 95 

Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro 
100 105 110 

Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu 
115 120 125 

Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg 
130 135 140 

Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile 
145 150 155 160 

Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg 
165 170 175 

Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala 
180 185 190 

Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val 
195 200 205 

Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile 
210 215 220 

Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile 
225 230 235 240 

Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys 
245 250 255 

Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile 
260 265 270 

Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr 
275 280 285 

Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro 
290 295 300 

Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys 
305 310 315 320 

Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp 
325 330 335 

Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys 
340 345 350 

Ile Glu Ile Lys 
355 

AMENDED CLAIMS 

[received by the International Bureau on 24 December 2002 (24.12.02); 
original claims 3, 9, 16, 20, 35, 52, 56, 80, 101, 105, 107, 111, 116, 123 and 128-132 amended; 
remaining claims unchanged (17 pages)] 



1. A method for producing an artificial chromosome, comprising: 
introducing nucleic acid into a cell comprising one or more plant 
chromosomes; and 
selecting a cell comprising an artificial chromosome that 
comprises one or more repeat regions 
wherein: 
one or more nucleic acid units is (are) repeated in a repeat 
region; 
repeats of a nucleic acid unit have common nucleic acid 
sequences; and 
the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 

2. The method of claim 1, wherein the artificial chromosome is 
predominantly made up of one or more repeat regions. 

3. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises a nucleic acid sequence that facilitates amplification of a 
region of a plant chromosome or that targets the nucleic acid to an 
amplifiable region of a plant chromosome. 

4. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises one or more nucleic acids selected from the group 
consisting of rDNA, lambda phage DNA and satellite DNA. 

5. The method of claim 4, wherein the nucleic acid comprises 
plant rDNA. 

6. The method of claim 5, wherein the rDNA is from a plant 
selected from the group consisting of Arabidopsis, Nicotiana, Solanum, 
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza. 

7. The method of claim 4, wherein the nucleic acid comprises 
animal rDNA. 

8. The method of claim 7, wherein the rDNA is mammalian rDNA.

9. The method of claim 4, wherein the nucleic acid comprises
rDNA comprising a sequence of an intergenic spacer region. 

10. The method of claim 9, wherein the intergenic spacer region is
from DNA from a plant selected from the group consisting of Arabidopsis,
Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat, radish and mung
bean.

11. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises a nucleic acid sequence that facilitates identification of
cells containing the nucleic acid.

12. The method of claim 11, wherein the nucleic acid sequence
encodes a fluorescent protein.

13. The method of claim 12, wherein the protein is a green 
fluorescent protein. 

14. The method of claim 1, wherein the step of selecting a cell
comprising an artificial chromosome comprises sorting of cells into which
nucleic acid was introduced.

15. The method of claim 1, wherein the step of selecting a cell
comprising an artificial chromosome comprises fluorescent in situ
hybridization (FISH) analysis of cells into which nucleic acid was introduced.

16. The method of claim 1, wherein the one or more plant
chromosomes contained in the cell is (are) selected from the group consisting
of Arabidopsis, tobacco and Helianthus chromosomes.

17. The method of claim 16, wherein the cell is a plant protoplast. 

18. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises nucleic acid encoding a selectable marker.

19. The method of claim 18, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, 
kanamycin, hygromycin, dihydrofolate or sulfonylurea. 

20. An isolated plant artificial chromosome comprising one or more
repeat regions, wherein:
one or more nucleic acid units is (are) repeated in a repeat
region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

21. The plant artificial chromosome of claim 20, wherein the artificial
chromosome is predominantly made up of one or more repeat regions.

22. A plant cell comprising an artificial chromosome, wherein the
artificial chromosome is produced by the method of claim 1 or claim 2.

23. A method of producing a transgenic plant, comprising 
introducing the artificial chromosome of claim 20 or claim 21 into a plant cell. 

24. The method of claim 23, wherein the artificial chromosome 
comprises heterologous nucleic acid encoding a gene product. 

25. The method of claim 24, wherein the heterologous nucleic acid
encodes a product selected from the group consisting of enzymes, antisense
RNA, tRNA, rDNA, structural proteins, marker proteins, ligands, receptors,
ribozymes, therapeutic proteins and biopharmaceutical proteins.

26. The method of claim 24, wherein the heterologous nucleic acid
encodes a product selected from the group consisting of vaccines, blood
factors, antigens, hormones, cytokines, growth factors and antibodies.

27. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product that provides for resistance to diseases, insects, herbicides 
or stress in the plant. 

28. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that provides for an agronomically important trait in the
plant.

29. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that alters the nutrient utilization and/or improves the
nutrient quality of the plant.

30. The method of claim 24, wherein the heterologous nucleic acid
is contained within a bacterial artificial chromosome (BAC) or a yeast 
artificial chromosome (YAC). 

31. A method of identifying plant genes encoding particular traits,
comprising:
generating an artificial chromosome comprising euchromatic
DNA from a first species of plant;
introducing the artificial chromosome into a plant cell of a
second species of plant; and
detecting phenotypic changes in the plant cell comprising the
artificial chromosome and/or a plant generated from the plant cell comprising
the artificial chromosome.

32. The method of claim 31, wherein the artificial chromosome is a 
plant artificial chromosome or a mammalian artificial chromosome. 

33. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a cell comprising one or more plant
chromosomes; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions, wherein:
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

34. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising a SATAC.

35. The method of claim 31, wherein the artificial chromosome is a
minichromosome produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a cell comprising a minichromosome comprising a
neo-centromere and euchromatin.

36. The method of any of claims 33-35, wherein the nucleic acid
introduced into the plant cell comprises DNA encoding a selectable marker.

37. The method of claim 36, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, 
kanamycin, hygromycin, dihydrofolate or sulfonylurea. 

38. The method of claim 31, wherein the artificial chromosome
comprising euchromatic DNA from a first plant species is produced by a
method comprising:
introducing into a plant cell of a first plant species an artificial
chromosome capable of undergoing homologous recombination with the DNA
of the first plant species;
selecting for a recombination event between the artificial chromosome
and the DNA of the first plant species; and
selecting an artificial chromosome comprising euchromatic DNA from
the first plant species.

39. The method of claim 31, wherein the artificial chromosome
comprising euchromatic DNA from a first plant species is produced by a
method comprising:
introducing into a plant cell of a first species an artificial chromosome
capable of undergoing site-specific recombination with the DNA of the first
plant species;
selecting for a site-specific recombination event between the artificial
chromosome and the DNA of the first plant species, and
selecting an artificial chromosome comprising euchromatic DNA from
the first plant species.

40. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence.

41. The method of claim 39, wherein the artificial chromosome
comprises a site-specific recombination sequence. 

42. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence
and the artificial chromosome comprises a site-specific recombination
sequence.

43. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence
and the artificial chromosome comprises a site-specific recombination
sequence that is complementary to the site-specific recombination sequence
of the plant cell of a first plant species.

44. The method of claim 39, wherein the site-specific 
recombination is catalyzed by a recombinase enzyme. 

45. A method for producing an acrocentric plant chromosome,
comprising:
introducing a first nucleic acid comprising a site-specific
recombination site into a first chromosome of a plant cell;
introducing a second nucleic acid comprising a site-specific
recombination site into a second chromosome of the plant cell;
introducing a recombinase activity into the plant cell, wherein
the activity catalyzes recombination between the first and second
chromosomes and whereby an acrocentric plant chromosome is produced.

46. The method of claim 45, wherein the first nucleic acid is 
introduced into the pericentric heterochromatin of the first chromosome. 

47. The method of claim 45, wherein the second nucleic acid is
introduced into the distal end of the arm of the second chromosome.

48. The method of claim 45, wherein the first nucleic acid is
introduced into the pericentric heterochromatin of the first chromosome and
the second nucleic acid is introduced into the distal end of the arm of the
second chromosome.

49. A method for producing an acrocentric plant chromosome,
comprising:
introducing a first nucleic acid comprising a site-specific
recombination site into the pericentric heterochromatin of a chromosome in a
plant cell;
introducing a second nucleic acid comprising a site-specific
recombination site into the distal end of the chromosome, wherein the first and
second recombination sites are located on the same arm of the chromosome;
introducing a recombinase activity into the cell, wherein the
activity catalyzes recombination between the first and second recombination
sites in the chromosome and whereby an acrocentric plant chromosome is
produced.

50. A method for producing an acrocentric plant chromosome,
comprising:
introducing nucleic acid comprising a recombination site adjacent
to nucleic acid encoding a selectable marker into a first plant cell;
generating a first transgenic plant from the first plant cell;
introducing nucleic acid comprising a promoter functional in a
plant cell, a recombination site and a recombinase coding region in operative
linkage into a second plant cell;
generating a second transgenic plant from the second plant cell;
crossing the first and second plants;
obtaining plants resistant to an agent that selects for cells
containing the nucleic acid encoding the selectable marker; and
selecting a resistant plant that contains cells comprising an
acrocentric plant chromosome.

51. The method of any of claims 45-50, wherein the DNA of the
short arm of the acrocentric chromosome contains less than 5% euchromatic
DNA.

52. The method of claim 51, wherein the DNA of the short arm of the
acrocentric chromosome contains less than 1% euchromatic DNA.

53. The method of any of claims 45-50, wherein the short arm of the
acrocentric chromosome does not contain euchromatic DNA.

54. The method of any of claims 45-49, wherein the nucleic acid
introduced into a chromosome comprises nucleic acid encoding a selectable
marker.

55. An acrocentric plant artificial chromosome, wherein the short arm 
of the acrocentric chromosome does not contain euchromatic DNA. 

56. A method of producing a plant artificial chromosome, comprising:
introducing nucleic acid into a plant acrocentric chromosome in a
cell, wherein the short arm of the acrocentric chromosome does not contain
euchromatic DNA;
culturing the cell through at least one cell division; and
selecting a cell comprising an artificial chromosome that is
predominantly heterochromatic.

57. The method of claim 56, wherein the acrocentric chromosome is 
produced by the method of any of claims 45-49. 

58. A method for producing an artificial chromosome, comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions
wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

59. The method of claim 4, wherein the nucleic acid comprises plant 
rDNA from a dicot plant species. 

60. The method of claim 4, wherein the nucleic acid comprises plant
rDNA from a monocot plant species.

61. The method of claim 9, wherein the intergenic spacer region is
from DNA from a Nicotiana plant.

62. The method of claim 9, wherein the rDNA is plant rDNA. 

63. The method of claim 62, wherein the plant is a dicot plant 
species. 

64. The method of claim 62, wherein the plant is a monocot plant 
species. 

65. The method of claim 1, wherein the cell is a dicot plant cell. 

66. The method of claim 1, wherein the cell is a monocot plant cell.

67. An isolated plant artificial chromosome comprising one or more
repeat regions, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

68. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions, wherein:
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that represent
euchromatic and heterochromatic nucleic acid.

69. The method of claim 44, wherein the recombinase is selected 
from the group consisting of a bacteriophage P1 Cre recombinase, a yeast R 
recombinase and a yeast FLP recombinase. 

70. The method of claim 50, further comprising selecting first and
second transgenic plants wherein:
one of the plants comprises a chromosome comprising a
recombination site located on a short arm of the chromosome in a region
adjacent to the pericentric heterochromatin; and
the other plant comprises a chromosome comprising a
recombination site located in rDNA of the chromosome.

71. The method of claim 70, wherein the recombination sites on the 
two chromosomes are in the same orientation. 

72. A method for producing an acrocentric plant chromosome,
comprising:
introducing nucleic acid comprising two site-specific
recombination sites into a cell comprising one or more plant chromosomes;
introducing a recombinase activity into the cell, wherein the
activity catalyzes recombination between the two recombination sites, whereby
a plant acrocentric chromosome is produced.

73. The method of claim 72, wherein the two site-specific
recombination sites are contained on separate nucleic acid fragments.

74. The method of claim 73, wherein the separate nucleic acid 
fragments are introduced into the cell simultaneously or sequentially. 

75. The method of claim 56, wherein the artificial chromosome is
predominantly heterochromatic.

76. A method of producing a plant artificial chromosome, comprising:
introducing nucleic acid into a plant chromosome in a cell,
wherein the chromosome contains adjacent regions of rDNA and
heterochromatic DNA;
culturing the cell through at least one cell division; and
selecting a cell comprising an artificial chromosome.

77. The method of claim 76, wherein the artificial chromosome is 
predominantly heterochromatic. 

78. The method of claim 76 or claim 77, wherein the plant
chromosome into which the nucleic acid is introduced is an acrocentric
chromosome.

79. The method of claim 78, wherein the short arm of the
chromosome contains adjacent regions of rDNA and heterochromatic DNA.

80. The method of claim 76, 77, or 79, wherein the 
heterochromatic DNA is pericentric heterochromatin. 

81. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal cells;
and wherein the agent is not toxic to plant cells;
a recognition site for recombination; and
a sequence of nucleotides that facilitates amplification of a
region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome.

82. The vector of claim 81, wherein the amplifiable region
comprises heterochromatic nucleic acid.

83. The vector of claim 81, wherein the amplifiable region 
comprises rDNA. 

84. The vector of claim 81, wherein the sequence of nucleotides
that facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome comprises a sufficient
portion of an intergenic spacer region of rDNA to facilitate amplification or
effect the targeting.

85. The vector of claim 84, wherein the sufficient portion contains
at least 14, 20, 30, 50, 100, 150, 300 or 500 contiguous nucleotides from
an intergenic spacer region.

86. The vector of claim 81, wherein the selectable marker encodes 
a product that confers resistance to zeomycin. 

87. A plant transformation vector, comprising:
a recognition site for recombination;
a sequence of nucleotides that facilitates amplification of a
region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome; and
one or more selectable markers that when expressed in a plant
cell permit the selection of the cell; wherein
the plant transformation vector is for Agrobacterium-mediated
transformation of plants.

88. The vector of claim 81, wherein the recognition site comprises 
an att site. 

89. The vector of claim 81, that is pAgIIa or pAgIIb.

90. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal cells;
and wherein the agent is not toxic to plant cells;
a recognition site for recombination; and
nucleic acid encoding a protein operably linked to a plant promoter.

91. The vector of claim 90, wherein the recognition site comprises 
an att site. 

92. The vector of claim 90, further comprising a sequence of
nucleotides that facilitates amplification of a region of a plant chromosome or
targets the vector to an amplifiable region of a plant chromosome.

93. The vector of claim 90, wherein the promoter is nopaline 
synthase (NOS) or CaMV35S. 

94. The vector of claim 93 that is pAg1 or pAg2.

95. The vector of claim 92, wherein the amplifiable region
comprises heterochromatic nucleic acid.

96. The vector of claim 92, wherein the amplifiable region 
comprises rDNA. 

97. The vector of claim 96, wherein the sequence of nucleotides
that facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome comprises a sufficient
portion of an intergenic spacer region of rDNA to effect the amplification or
the targeting.

98. The vector of claim 90, wherein the protein is a selectable
marker that permits growth of plant cells in the presence of an agent
normally toxic to the plant cells.

99. The vector of claim 98, wherein the selectable marker confers
resistance to hygromycin or to phosphinothricin.

100. The vector of claim 90, wherein the protein is a fluorescent
protein.

101. The vector of claim 100, wherein the fluorescent protein is 
selected from the group consisting of green, blue and red fluorescent proteins. 

102. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of plant cells in the presence of an agent normally toxic to the plant cells; and
wherein the agent is not toxic to animal cells;
a recognition site for recombination; and
nucleic acid encoding a protein operably linked to a plant promoter.

103. A vector, comprising:
a recognition site for recombination; and
a sequence of nucleotides that facilitates amplification of a
region of a plant chromosome or targets the vector to an amplifiable region of
a plant chromosome, wherein the plant is selected from the group consisting
of Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus, Hordeum, Zea
mays, Brassica, Triticum, Helianthus, Glycine, soybean, Gossypium, cotton,
Helianthus, sunflower and Oryza.

104. The vector of claim 103, wherein the recognition site comprises 
an att site. 

105. A cell, comprising a vector of any of claims 81-86 and 88-104.

106. The cell of claim 105 that is a plant cell.

107. A method, comprising:
introducing a vector of claim 90 into a cell, wherein:
the cell comprises an animal platform ACes that contains a recognition site
that recombines with the recognition site in the vector in the presence of the
recombinase therefor, thereby incorporating the selectable marker that is not
operably associated with any promoter and the nucleic acid encoding a protein
operably linked to a plant promoter into the platform ACes to produce a
resulting platform ACes.

108. The method of claim 107, wherein the recombination sites are
att sites.

109. The method of claim 107, wherein the animal is a mammal. 

110. The method of claim 107, wherein the platform ACes comprises
a promoter that upon recombination is operably linked to the selectable
marker that in the vector is not operably associated with a promoter.

111. The method of any of claims 107-110, further comprising,
transferring the resulting platform ACes into a plant cell to produce a plant
cell that comprises the platform ACes.

112. The method of claim 111, wherein the resulting platform ACes
is isolated prior to transfer.

113. The method of claim 111, wherein the isolated ACes is
introduced into a plant cell by a method selected from the group consisting of
protoplast transfection, lipid-mediated delivery, liposomes, electroporation,
sonoporation, microinjection, particle bombardment, silicon carbide
whisker-mediated transformation, polyethylene glycol (PEG)-mediated DNA
uptake, lipofection and lipid-mediated carrier systems.

114. The method of claim 111, wherein the resulting platform ACes 
is transferred by fusion of the cells. 

115. The method of claim 111, wherein the cells are plant 
protoplasts. 

116. The method of claim 107, wherein the cell is an animal cell.

117. The method of claim 116, wherein the animal cell is a
mammalian cell.

118. The method of claim 111, further comprising culturing the plant
cell that comprises the platform ACes under conditions whereby the protein
encoded by the nucleic acid that is operably linked to a plant promoter is
expressed.

119. A method, comprising:
introducing a vector of claim 81 into a plant cell;
culturing the plant cells; and
selecting a plant cell comprising an artificial chromosome that comprises
one or more repeat regions.

120. The method of claim 119, wherein sufficient portion of the vector
integrates into a chromosome in the plant cell to result in amplification of
chromosomal DNA.

121. The method of claim 119 or claim 120, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

122. The method of claim 119, further comprising isolating the 
artificial chromosome. 

123. A method, comprising:
introducing a vector into a cell, wherein:
i) the vector comprises:
a) nucleic acid encoding a selectable marker that is
not operably associated with any promoter, wherein the
selectable marker permits growth of animal cells in the presence
of an agent normally toxic to the animal cells; and wherein the
agent is not toxic to plant cells;
b) a recognition site for recombination; and
c) nucleic acid encoding a protein operably linked to
an animal promoter;
ii) the cell comprises:
a platform plant artificial chromosome (PAC) that
comprises a recombination site and an animal promoter that upon
recombination is operably linked to the selectable marker that, in
the vector, is not operably associated with a promoter;
iii) introduction is effected under conditions whereby
the vector recombines with the PAC to produce a plant platform PAC that
contains the selectable marker operably linked to the promoter; and
culturing the resulting cell under conditions, whereby the protein
encoded by nucleic acid operably linked to an animal promoter is expressed.

124. The method of claim 119, wherein the artificial chromosome is an
ACes.

125. The method of claim 123, wherein the plant platform PAC is an
ACes.

126. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises nucleic acid encoding a selectable marker.

127. The vector of claim 81, further comprising one or more selectable 
markers that when expressed in the plant cell permit the selection of the cell. 

128. A method of producing a plant artificial chromosome, comprising:
introducing the vector of claim 81, 87 or 127 into a cell
comprising one or more plant chromosomes; and
selecting a cell comprising an artificial chromosome that
comprises one or more repeat regions; wherein
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid sequences; and
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

129. A method of producing a plant artificial chromosome, comprising:
introducing the vector of claim 81, 87 or 127 into a cell
comprising one or more plant chromosomes; and
selecting a cell comprising an artificial chromosome that
comprises one or more repeat regions; wherein
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

130. The method of claim 123, wherein the cell into which the vector 
is introduced is an animal cell. 

131. The method of claim 130, wherein the cell is a mammalian cell. 

132. The method of claim 78, wherein the heterochromatic DNA is
pericentric heterochromatin.

PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS
OF PREPARING PLANT ARTIFICIAL CHROMOSOMES 

RELATED APPLICATIONS 

Benefit of priority is claimed to U.S. Provisional Application No.
60/294,687, filed May 30, 2001, by CARL PEREZ AND STEVEN
FABIJANSKI entitled PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF
AND METHODS FOR PREPARING PLANT ARTIFICIAL CHROMOSOMES and
to U.S. Provisional Application No. 60/296,329, filed June 4, 2001, by CARL
PEREZ AND STEVEN FABIJANSKI entitled PLANT ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING PLANT
ARTIFICIAL CHROMOSOMES. This application is related to U.S. Provisional
Application No. 60/294,758, filed May 30, 2001, by EDWARD PERKINS et
al., entitled CHROMOSOME-BASED PLATFORMS and to U.S. Provisional
Application No. 60/366,891, filed March 21, 2002, by EDWARD
PERKINS et al., entitled CHROMOSOME-BASED PLATFORMS. This
application is also related to U.S. Provisional Application Attorney Docket
No. 24601-420, filed May 30, 2002, by EDWARD PERKINS et al., entitled
CHROMOSOME-BASED PLATFORMS and to PCT International Patent
Application Attorney Docket No. 24601-420PC, filed May 30, 2002, by
EDWARD PERKINS et al., entitled CHROMOSOME-BASED PLATFORMS.
This application is related to U.S. application Serial No. 08/695,191, filed
August 7, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES, now U.S. Patent No. 6,025,155.
This application is also related to U.S. application Serial No. 08/682,080,
filed July 15, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES, now U.S. Patent No. 6,077,697.
This application is also related to U.S. application Serial No. 08/629,822, filed
April 10, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled



WO 2002/096923 



PCT/US2002/017451 



ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES (now abandoned), and is also
related to copending U.S. application Serial No. 09/096,648, filed June 12,
1998, by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING
ARTIFICIAL CHROMOSOMES and to U.S. application Serial No. 09/835,682,
filed April 10, 1997 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES (now abandoned). This
application is also related to copending U.S. application Serial No.
09/724,726, filed November 28, 2000, U.S. application Serial No.
09/724,872, filed November 28, 2000, U.S. application Serial No.
09/724,693, filed November 28, 2000, U.S. application Serial No.
09/799,462, filed March 5, 2001, U.S. application Serial No. 09/836,911,
filed April 17, 2001, and U.S. application Serial No. 10/125,767, filed April
17, 2002, each of which is by GYULA HADLACZKY and ALADAR SZALAY,
and is entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND
METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES. This application
is also related to International PCT application No. WO 97/40183. Where
permitted, the subject matter of each of these applications is incorporated by
reference in its entirety.
FIELD OF THE INVENTION 

Artificial chromosomes and methods of producing artificial
chromosomes, particularly for use in delivery of nucleic acids and expression
thereof in plants are provided. Also provided are methods of use of artificial
chromosomes in the delivery of nucleic acids to host cells, including plant
cells, and the expression of the nucleic acids therein. The resulting plant
cells, tissues, organs and whole plants containing the artificial chromosomes,
plant cell-based methods for production of heterologous proteins and
methods of producing transgenic organisms, particularly plants, using the
artificial chromosomes are provided.
BACKGROUND OF THE INVENTION 

The stable transfer of nucleic acids into plant cells and the expression
of the nucleic acids therein pose many challenges. Many efforts at the
stable introduction of nucleic acids into plant cells have utilized
Agrobacterium-mediated transformation. Agrobacterium is a free-living
Gram-negative soil bacterium. Virulent strains of this bacterium are able to 
infect plant tissue and induce the production of a neoplastic growth
commonly referred to as a crown gall. Virulent strains of Agrobacterium
contain a large plasmid DNA known as a Ti-plasmid that contains genes 
required for DNA transfer (vir genes) and replication as well as a region of 
DNA that is transferred to plant cells called T-DNA. The T-DNA region is 
bordered by T-DNA border sequences that are crucial to the DNA transfer
process. These T-DNA border sequences are recognized by the vir genes
encoded on the Ti-plasmid and the vir genes are responsible for the DNA 
transfer process. 

Most wild-type Agrobacterium have a relatively broad dicot plant host 
range and are capable of transferring T-DNA regions up to 25 kilobases of
DNA (e.g., nopaline strains) or more (e.g., octopine strains). Accordingly,
numerous methods of using Agrobacterium to transfer DNA into plant cells 
have been developed based on the engineering of the Ti-plasmid to no longer 
contain the genes responsible for altered morphology and replacing these 
genes with a recombinant gene encoding a trait of interest. There are two
primary types of Agrobacterium-based plant transformation systems, binary
[see, e.g., U.S. Patent No. 4,940,838] and co-integrate [see, e.g., Fraley et 
al. (1985) Biotechnology 3:629-635] methods. The T-DNA border repeats 
are maintained in both systems and the natural DNA transfer process is used 
to transfer the portion of DNA located between the T-DNA borders into the
plant cell.

Another plant cell transformation system, termed biolistics, involves
the bombardment of plant cells with microscopic particles coated with DNA 
encoding a new trait. The particles are rapidly accelerated, typically by gas 
or electrical discharge, through the cell wall and membranes, whereby the
DNA is released into the cell and is incorporated into the genome of the cell.
This method is used for transformation of many crops, including corn, wheat, 
barley, rice, woody tree species and others. 

A significant number of crop species of commercial interest have been 
transformed using either Agrobacterium-mediated or biolistic systems.
However, these methods have many limitations that restrict their utility. For
example, there are limits to the size of the heterologous DNA that can be 
transferred using these methods; typically, only one to two genes may be 
transferred. Thus, although these methods may have utility in producing 
crop products modified to contain a single new trait, such as insect or
herbicide tolerance, they may not be sufficient to transfer DNA that will
provide for multiple traits, or very large DNA segments encoding a 
multiplicity of traits. 

In addition, the genetically modified plant cells produced by these 
methods tend to contain the transferred DNA in euchromatic regions of the
genomic DNA. Typically, a large number of independent transgenic insertion
events must be screened before a suitable event (such as insertion of a gene 
into the host genomic DNA such that it provides a sufficient level of gene 
expression within temporal and spatial expectations and without evidence of 
gene rearrangement) is identified.

Another limitation of these methods is the effort required to utilize
them in the genetic modification of many commercially important crops. For
example, transformation efficiency can vary with the crop and can be low, 
notably in cereal crops such as corn and wheat. Often the inserted genes 
are rearranged and unstable over generations.
Furthermore, Agrobacterium tumefaciens relies on host-parasite
interaction in order to be successful. This has the effect that Agrobacterium 
has a preference for some dicots, while other dicots, monocots and conifers 
are resistant to transformation via Agrobacterium. Self-replicating vectors
have also been used in the transfer of nucleic acids into plant cells. Such
episomal vectors contain DNA sequences that are required for DNA 
replication and sustainability of the vector in a living cell. In higher plants, 
very few episomal vectors have been developed. These episomal vectors 
have the drawback of having a very limited capacity for carrying genetic
information and are unstable. One example of an episomal plant vector is
the Cauliflower Mosaic Virus [Brisson et al. (1984) Nature 310:511].

Limitations of these gene delivery technologies necessitate the 
development of alternative vector systems suitable for transferring large (up 
to Mb size or larger) genes, gene complexes, and multiple genes together
with regulatory elements for safe, controlled, and persistent expression of
the desired genetic material in higher organisms, particularly plants, without 
rearrangement caused by insertion or mutagenesis. Therefore, it is an object 
herein to provide artificial chromosomes for the introduction of large nucleic 
acids into eukaryotic cells and methods using the artificial chromosomes,
particularly for the introduction and expression of nucleic acids in plants.
SUMMARY OF THE INVENTION 

Provided herein are plant artificial chromosomes and methods for 
producing plant artificial chromosomes. The artificial chromosomes are fully 
functional stable chromosomes. Plant artificial chromosomes provided herein
have a particular composition that makes them ideal vectors for stable,
controlled, high-level expression of heterologous nucleic acids in plant cells.
The artificial chromosomes are capable of independent, extra-genomic 
maintenance, replication and segregation within cells and can carry multiple, 
large heterologous genes.

Artificial plant chromosomes provided herein are non-natural
chromosomes that exhibit an ordered segmentation that distinguishes them 
from naturally occurring chromosomes. The segmented appearance can be 
visualized using a variety of chromosome analysis techniques and correlates
with the unique structure of these artificial chromosomes, which, in
particular methods of producing these chromosomes, can arise through 
amplification of chromosomal segments (i.e., amplification-based artificial 
chromosomes). The artificial chromosomes, throughout the region or regions 
of segmentation, are predominantly made up of one or more nucleic acid
units that is (are) repeated in the region (referred to as the repeat region) and
that have a similar gross structure. Repeats of a nucleic acid unit tend to be 
of similar size and share some common nucleic acid sequences, for example, 
a replication site involved in amplification of chromosome segments and/or 
some heterologous nucleic acid. Although the size of a repeating nucleic
acid unit can vary, typically the units tend to be greater than about 100 kb,
greater than about 500 kb, greater than about 1 Mb, greater than about 5 
Mb or greater than about 10 Mb. Typically, repeats of a nucleic acid unit are 
substantially similar in nucleic acid composition and can be nearly identical. 
The common nucleic acid sequences can contain sequences that represent
euchromatic and heterochromatic nucleic acid. The composition of the
amplification-based artificial chromosomes can be such that substantially the
entire chromosome exhibits a segmented appearance or such that only one
or more portions that make up less than the entire chromosome appear
segmented.

The composition of the plant artificial chromosomes provided herein
can vary. For example, in some of the artificial chromosomes provided
herein, the repeat region or regions can be made up predominantly of 
heterochromatic DNA (i.e., the repeat region or regions contain more 
heterochromatic DNA than other types of DNA, e.g., euchromatic DNA). In
other artificial chromosomes provided herein, the repeat region or regions can
be made up predominantly of euchromatic DNA (i.e., the repeat region or
regions contain more euchromatic DNA than other types of DNA, e.g., 
heterochromatic DNA) or can be made up of substantially equivalent 
amounts of heterochromatic and euchromatic DNA, e.g., about 40% to
about 50% of one type of nucleic acid and about 50% to about 60% of the
other type of nucleic acid. The repeat region or regions thus can be entirely 
heterochromatic (while still containing one or more heterologous genes), or 
can contain increasing amounts of euchromatic DNA, such that, for example, 
the region contains about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%,
90% or greater than 90% euchromatic DNA. Common nucleic acid
sequences within repeated nucleic acid units in a repeat region can contain
DNA that represents euchromatic nucleic acid and DNA that represents 
heterochromatic nucleic acid. Because the entire artificial chromosome can 
be made up predominantly of a repeat region or regions (e.g., the
composition of the chromosome is such that the repeat region or regions
make up greater than about 50% or greater than about 60% of the 
chromosome), it is thus possible for the artificial chromosome to be made up 
predominantly of heterochromatin or euchromatin, or to be made up of 
substantially equivalent amounts of heterochromatin and euchromatin, e.g.,
about 40% to about 50% of one type of nucleic acid and about 50% to
about 60% of the other type of nucleic acid. Plant artificial chromosomes 
provided herein can be isolated or contained within cells or vesicles. 
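
The composition categories described above can be illustrated with a short sketch. The function name, the fractional input and the exact 40% and 60% cutoffs are assumptions made for illustration only; the text gives the ranges as approximate ("about 40% to about 50%" of one type and "about 50% to about 60%" of the other), and the sketch arbitrarily gives the "substantially equivalent" label precedence where the categories could overlap. It is not a method of measuring chromatin composition.

```python
def classify_repeat_region(heterochromatic_fraction: float) -> str:
    """Label a repeat region by its share of heterochromatic DNA.

    Assumption for illustration: the region contains only heterochromatic
    and euchromatic DNA, so the euchromatic share is
    1 - heterochromatic_fraction.
    """
    if not 0.0 <= heterochromatic_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # "Substantially equivalent": neither type exceeds roughly 60% of the DNA.
    if 0.40 <= heterochromatic_fraction <= 0.60:
        return "substantially equivalent"
    if heterochromatic_fraction > 0.60:
        return "predominantly heterochromatic"
    return "predominantly euchromatic"
```

Under these assumed cutoffs, a region that is 45% heterochromatic would be labeled substantially equivalent, while an entirely heterochromatic region (fraction 1.0) would be labeled predominantly heterochromatic.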

Also provided herein are cells containing plant artificial chromosomes
as described herein, including plant cells and animal cells. Included among
the cells containing the plant artificial chromosomes are any cells that include
one or more plant chromosomes. Included, for example, are plant cells, 
including plant protoplasts, in culture and within plant tissues, organs, seeds, 
pollen or whole plants. Plant cells containing the plant artificial 
chromosomes can be from any type of plant, including monocots and dicots.
For example, the plant cells can be from Arabidopsis, Nicotiana, Solanum,
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum, Helianthus,
Oryza, Glycine (soybean) and Gossypium (cotton). Also contemplated are
mammalian and other animal cells that contain plant ACs.

Plant cells containing artificial chromosomes of any species are also
provided herein. Thus, for example, such plant cells can contain an artificial
chromosome containing an animal, e.g., mammalian, centromere or an insect 
or avian centromere. Included among the artificial chromosomes contained 
within plant cells as provided herein are predominantly heterochromatic 
[formerly referred to as satellite artificial chromosomes (SATACs); see, e.g.,
U.S. Patent Nos. 6,077,697 and 6,025,155 and published International PCT
application No. WO 97/40183], minichromosomes which contain a de novo 
centromere, artificial chromosomes containing one or more regions of 
repeating nucleic acid units wherein the repeat region(s) contain substantially 
equivalent amounts of euchromatic and heterochromatic nucleic acid and in
vitro assembled artificial chromosomes, each from any species. An
exemplary artificial chromosome is a mammalian satellite artificial 
chromosome containing a mouse centromere. Included among the plant cells 
containing artificial chromosomes of any species are plant cells, including 
plant protoplasts, in culture and within plant tissues, organs, seeds, pollen or
whole plants. Plant cells containing the artificial chromosomes can be from
any type of plant, including monocots and dicots. For example, the plant
cells can be from Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus,
Hordeum, Zea mays, Brassica, Triticum, Helianthus and Oryza.

Further provided herein are methods of producing plant artificial
chromosomes. One embodiment of these methods includes the steps of
introducing nucleic acid into a cell containing plant chromosomes and 
selecting a cell containing an artificial chromosome that contains one or more 
repeat regions in which one or more nucleic acid units is (are) repeated. The 
repeats of a nucleic acid unit in a repeat region can contain common nucleic
acid sequences and can be substantially identical. In some embodiments of
this method, the repeat region(s) of the artificial chromosome contain
substantially equivalent amounts of euchromatic and heterochromatic nucleic 
acid. The artificial chromosome can be predominantly made up of one or 
more repeat regions. In further embodiments of this method, the artificial
chromosome is made up of substantially equivalent amounts of euchromatic
and heterochromatic nucleic acid. In further embodiments of this method, 
the repeats of a nucleic acid unit have common nucleic acid sequences 
which contain sequences that represent euchromatic and heterochromatic 
nucleic acid.

Any cell containing plant chromosomes can be used in these
embodiments of methods of producing plant artificial chromosomes described
herein. For example, the cell can be any cell that contains chromosomes
from Arabidopsis, tobacco, Solanum, Lycopersicon, Daucus, Hordeum, Zea
mays, Brassica, Triticum, Oryza, Capsicum, lentil and/or Helianthus, including
cells or protoplasts of Arabidopsis, tobacco and/or Helianthus.

The nucleic acid that is introduced into a cell containing plant 
chromosomes in methods of producing a plant artificial chromosome as 
provided herein can be any nucleic acid, including, but not limited to, satellite 
DNA, rDNA and lambda phage DNA. Satellite DNA and rDNA include such
DNA from plants, such as, for example, Arabidopsis, Nicotiana, Solanum,
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza, 
and from animals, such as mammals. The rDNA can contain sequences of 
an intergenic spacer region, such as can be obtained, for example, from DNA 
of Arabidopsis, Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat,
radish and mung bean. In some embodiments of the method, the nucleic
acid contains a nucleic acid sequence that facilitates amplification of a region
of a plant chromosome or targets it to an amplifiable region of a plant 
chromosome. 

In further embodiments of methods of producing plant artificial
chromosomes provided herein, the nucleic acid that is introduced into a cell
containing one or more plant chromosomes includes nucleic acid that provides for
identification of cells containing the nucleic acid. Such nucleic acids include
nucleic acid encoding a fluorescent protein, such as a green, blue or red 
fluorescent protein, and nucleic acid encoding a selectable marker, such as,
for example, proteins that confer resistance to phosphinothricin, ammonium
glufosinate, glyphosate, kanamycin, hygromycin, dihydrofolate or
sulfonylurea. 

In embodiments of methods of producing plant artificial chromosomes 
in which nucleic acid is introduced into a cell containing one or more plant
chromosomes, the cell can be cultured through two or more cell doublings,
and typically from about 5 to about 60, or about 5 to about 55, or about 10 
to about 55, or about 25 to about 55, or about 35 to about 55 cell doublings 
following introduction of nucleic acid into a cell. The step of selecting a cell 
containing a plant artificial chromosome can include sorting of cells into
which nucleic acid was introduced. For example, cells can be sorted on the
basis of the presence of a selectable marker, such as a reporter protein, or 
by growing (culturing) the cells under selective conditions. The selection 
step can include fluorescent in situ hybridization (FISH) analysis of cells into 
which nucleic acid is introduced.

Also provided are methods of producing a transgenic plant using
artificial chromosomes that function in plants and transgenic plants
containing artificial chromosomes. Artificial chromosomes used in the 
methods of producing transgenic plants can be of any species. For example, 
the artificial chromosomes can contain a centromere from species such as
animals, e.g., mammals, birds, plants, or insects, that functions to segregate
nucleic acids to daughter cells through cell division. In some embodiments 
of the methods for producing a transgenic plant, the artificial chromosomes 
contain repeat regions predominantly made up of repeats of one or more 
nucleic acid units. Repeats of a nucleic acid unit can share some common
nucleic acid sequences, for example, a replication site involved in
amplification of chromosome segments and/or some heterologous nucleic
acid. Repeats of a nucleic acid unit can be substantially identical. Common 
nucleic acid sequences of repeats of a nucleic acid unit can contain 
sequences that represent euchromatic and heterochromatic nucleic acid.

Repeat regions of artificial chromosomes that can be used in the
methods of producing a transgenic plant can be made up of substantially
equivalent amounts of heterochromatic and euchromatic DNA or can be 
made up predominantly of heterochromatic DNA or can be made up 
predominantly of euchromatic DNA. The artificial chromosome can be made
up predominantly of heterochromatic or euchromatic DNA or can be made up
of substantially equivalent amounts of heterochromatin and euchromatin. 
Such artificial chromosomes that contain plant centromeres can contain a 
plant centromere from any species of plant, including monocots and dicots. 
For example, the centromere can be from Arabidopsis, tobacco, Helianthus,
Solanum, Lycopersicon, Daucus, Hordeum, Zea, Brassica, Triticum, rye,
wheat, radish, mung bean or Oryza. The artificial chromosomes can be 
made using methods described herein. 

In a method of producing a transgenic plant provided herein, an 
artificial chromosome, such as those described above and elsewhere herein,
is introduced into a plant cell. The artificial chromosome can contain
heterologous nucleic acid encoding a gene product such as, for example, an
enzyme, antisense RNA, tRNA, rDNA, a structural protein, a marker or 
reporter protein, a ligand, a receptor, a ribozyme, a therapeutic protein, a 
biopharmaceutical protein, a vaccine, a blood factor, an antigen, a hormone,
a cytokine, a growth factor or an antibody. The product can be one that
provides for resistance to diseases, insects, herbicides or stress in the plant.
The product can be one that provides for an agronomically important trait in 
the plant and/or that alters the nutrient utilization and/or improves the 
nutrient quality of the plant. Heterologous nucleic acid of an artificial
chromosome can be contained within a bacterial artificial chromosome (BAC)
or a yeast artificial chromosome (YAC). 

The plant cell into which such artificial chromosomes can be 
introduced in methods of producing a transgenic plant provided herein can be
any species of plant cell, including, but not limited to, Arabidopsis, tobacco,
Helianthus, Solanum, Lycopersicon, Daucus, Hordeum, Zea, Brassica, 
Triticum, rye, wheat, radish, mung bean, Capsicum, lentil and Oryza. Any 
cell that can develop into a plant can be used, including plant cells and 
protoplasts of plant embryos, calli, tissues, meristem, organs, seeds,
seedlings, pollen, pollen tubes or whole plants.

Artificial chromosomes can be introduced into plant cells in the 
methods of producing a transgenic plant using any process for transfer of 
nucleic acids into plant cells, including, but not limited to, chemical, physical
and electrical processes and combinations thereof. For example, the artificial
chromosomes can be transferred into plant cells via direct contact in the
absence or presence of a fusogen, e.g., polyethylene glycol (PEG), calcium
phosphate and/or lipid or they can be encapsulated in a lipid structure (e.g., a
liposome) or contained within a protoplast or microcell which is then allowed 
to fuse (in the presence or absence of a fusogen such as PEG) with a plant
cell for introduction of the artificial chromosome into the cell in a method of
producing a transgenic plant. Artificial chromosomes can be transferred to
plant cells that are subjected to electrical pulses (e.g., electroporation) and/or
ultrasound (e.g., sonoporation) before, during and/or after exposure of the
cells to the artificial chromosomes. Use of electrical pulses and/or ultrasound
can be in combination with any other agents, e.g., PEG and/or lipids, used in
transferring nucleic acids into plant cells. Artificial chromosomes can also be 
physically injected into plant cells through a micropipette or needle or 
introduced into plant cells through bombardment of the cells with 
microprojectiles coated with the chromosomes. To facilitate transfer of
nucleic acids into plant cells, the recipient cells or tissue can be subjected to
mechanical wounding. 

Plant cells into which artificial chromosomes have been introduced for 
purposes of producing a transgenic plant are cultured under conditions that
permit generation of a whole plant therefrom. The transformed cells can be
analyzed prior to use in the generation of whole plants to determine 
suitability. For example, the cells can be analyzed for the presence of 
artificial chromosomes and/or regenerative capacity. Plant regeneration 
techniques, many of which are known to those of skill in the art, can be
used to generate whole plants from, for example, cells, embryos and calli
containing artificial chromosomes. For example, plants can be regenerated 
from cells containing artificial chromosomes by the planting of transformed 
roots, plantlets, seed, seedlings, and any structure capable of growing into a 
whole plant.

Further provided herein are methods for producing an acrocentric plant
chromosome and methods for producing plant chromosomes containing
adjacent regions of rDNA and heterochromatin, in particular, pericentric 
and/or satellite heterochromatin. Also provided herein are methods for 
generating acrocentric plant chromosomes containing adjacent regions of
heterochromatin, such as pericentric heterochromatin and/or satellite DNA,
and rDNA on the short arm of the chromosome. 

One embodiment of these methods includes steps of introducing 
nucleic acid containing two site-specific recombination sites into a cell 
containing one or more plant chromosomes, recombining nucleic acids of the
two site-specific recombination sites, and selecting a cell containing an
acrocentric plant chromosome and/or a plant chromosome containing 
adjacent regions of rDNA and heterochromatin. The two site-specific 
recombination sites can be contained on separate nucleic acid fragments 
which are introduced into the cell simultaneously or sequentially.

Other embodiments of the methods of producing an acrocentric plant
chromosome and/or a plant chromosome that contains adjacent regions of 
rDNA and heterochromatin include steps of introducing a first nucleic acid 
containing a site-specific recombination site into a first plant chromosome,
introducing a second nucleic acid containing a site-specific recombination
site into a second plant chromosome, recombining nucleic acids of the first 
and second chromosomes and selecting a plant chromosome that is 
acrocentric or that contains adjacent regions of rDNA and heterochromatin. 
For example, to produce an acrocentric plant chromosome, the first nucleic
acid can be introduced into or adjacent to the pericentric heterochromatin of
the first chromosome and/or the second nucleic acid can be introduced into 
the distal end of the arm of the second chromosome. To produce an 
acrocentric plant chromosome containing adjacent regions of rDNA and 
heterochromatin, for example, the first nucleic acid can be introduced into or
adjacent to the pericentric heterochromatin on the short arm of an acrocentric
plant chromosome and the second nucleic acid can be introduced into or 
adjacent to rDNA. To produce a plant chromosome containing adjacent 
regions of rDNA and heterochromatin, for example, the first nucleic acid can 
be introduced into or adjacent to heterochromatin, such as pericentric
heterochromatin or satellite DNA, and the second nucleic acid can be
introduced into or adjacent to rDNA. When the chromosomes are located
within a cell, the method can include selecting a cell containing a plant 
chromosome that is acrocentric and/or that contains adjacent regions of 
rDNA and heterochromatin.

Another embodiment of the methods of producing an acrocentric plant
chromosome includes steps of introducing a first nucleic acid containing a
site-specific recombination site into the pericentric heterochromatin of a plant 
chromosome, introducing a second nucleic acid containing a site-specific 
recombination site into the distal end of the chromosome in which the first
and second recombination sites are located on the same arm of the
chromosome, recombining nucleic acids of the first and second
recombination sites in the chromosome and selecting a plant chromosome 
that is acrocentric. 
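
As an illustration of the embodiment above, the following sketch models a chromosome arm as an ordered list of labeled segments and removes everything between two recombination sites located on the same arm. The segment labels and the function name are hypothetical, and the sketch assumes that the two sites are in the same orientation so that recombination excises the intervening DNA. It is a toy model under those stated assumptions, not a description of any particular recombinase.

```python
def excise_between_sites(arm: list[str], site: str = "site") -> list[str]:
    """Return the arm after recombination between its two sites.

    The arm is listed from centromere to telomere. Recombination between
    two same-orientation sites is assumed to leave a single site and to
    drop the segments that lay between them.
    """
    positions = [i for i, segment in enumerate(arm) if segment == site]
    if len(positions) != 2:
        raise ValueError("expected exactly two recombination sites")
    first, second = positions
    # Keep everything up to and including the first site, then resume
    # after the second site.
    return arm[: first + 1] + arm[second + 1 :]


# Hypothetical arm: first site in the pericentric heterochromatin,
# second site at the distal end of the same arm.
arm = ["centromere", "pericentric heterochromatin", "site",
       "euchromatin A", "euchromatin B", "site", "telomere"]
short_arm = excise_between_sites(arm)
print(short_arm)
```

In this model the shortened arm retains the centromere-proximal heterochromatin and the telomere but none of the intervening euchromatic segments, which corresponds to the short arm of an acrocentric chromosome.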

Another method of producing an acrocentric plant chromosome or a
plant chromosome containing adjacent regions of rDNA and heterochromatin
includes steps of introducing nucleic acid containing a recombination site 
adjacent to or sufficiently near nucleic acid encoding a selectable marker into 
a first plant cell for recombination and introduction of the marker into the 
chromosome, generating a first transgenic plant from the first plant cell,
introducing nucleic acid containing a promoter functional in a plant cell and a
recombination site in operative linkage into a second plant cell, generating a 
second transgenic plant from the second plant cell, crossing the first and 
second plants, obtaining plants resistant to an agent that selects for cells 
containing the nucleic acid encoding the selectable marker, and selecting a
resistant plant that contains cells containing an acrocentric plant
chromosome or a plant chromosome containing adjacent regions of rDNA
and heterochromatin. Methods of this embodiment can optionally include 
steps of selecting first and second transgenic plants such that one of the 
plants contains a chromosome containing a recombination site in a region
within or adjacent to the pericentric heterochromatin and the other plant
contains a chromosome containing a recombination site located within or 
adjacent to rDNA of the chromosome. These methods can further include 
the steps of selecting first and second transgenic plants where one of the 
plants contains a chromosome containing a recombination site located on a
short arm of the chromosome in a region adjacent to the pericentric
heterochromatin and the other plant contains a chromosome containing a
recombination site located in rDNA of the chromosome. In one embodiment,
the recombination sites on the two chromosomes are in the same orientation.

In methods of producing an acrocentric plant chromosome, one or
both of these recombination sites is located on a short arm of the 
chromosome. For example, one of the plants contains a
chromosome containing a recombination site in a region within or adjacent to
the pericentric heterochromatin located on the short arm of the chromosome.
The selecting steps can further include selecting first and second transgenic 
plants such that the recombination sites on the two chromosomes are in the 
same orientation. 

In any of these methods of producing an acrocentric plant
chromosome or a plant chromosome containing adjacent regions of rDNA
and heterochromatin (in particular, pericentric heterochromatin and/or
satellite DNA), recombination between the first and second site-specific
recombination sites can be provided for in a number of ways. For example, a
recombinase activity that catalyzes the recombination reaction can be
introduced into a cell containing one or more chromosomes containing the
sites. The recombinase activity can be encoded by nucleic acid that is
introduced into the cell simultaneously with nucleic acid containing a site- 
specific recombination site or that is introduced into the cell at a different 
time. Recombinase activity occurs within the cell upon expression of the
nucleic acid encoding a recombinase activity, which can be operatively linked
to a promoter functional in the cell. The recombinase activity can be 
constitutively expressed or can be induced, for example, by linking the 
nucleic acid encoding the recombinase to an inducible promoter. It is also 
possible that a cell into which nucleic acid containing site-specific
recombination sites is introduced contains a recombinase enzyme which can
be constitutively or inducibly expressed. Alternatively, a transgenic plant can 
be generated from cells containing the recombination sites and crossed with 
a transgenic plant containing nucleic acid encoding a recombinase. 

Any site-specific recombinase system known to those of skill in the
art is contemplated for use herein. It is contemplated that one or a plurality
of sites that direct the recombination by the recombinase are introduced into
the ACes (or other ACs) and then heterologous genes linked to the cognate 
site are introduced into an ACes to produce platform ACes. The resulting 
ACes are introduced into cells with nucleic acid encoding the cognate
recombinase, typically on a vector, and nucleic acid encoding heterologous
nucleic acid of interest linked to the appropriate recombination site for
insertion into the ACes chromosome. The recombinase encoding nucleic
acid may be introduced into the AC, including ACes, or on the same or a
different vector from the heterologous nucleic acid.

For the methods herein any recombinase enzyme that catalyzes
site-specific recombination can be used to facilitate recombination between the
first and second site-specific recombination sites. A variety of recombinases
and attachment/recombination sites therefor are available and/or known to
those of skill in the art. These include, but are not limited to: the Cre/lox
recombination system using CRE recombinase from the Escherichia coli
phage P1, the FLP/FRT system of yeast using the FLP recombinase from the
2µ episome of Saccharomyces cerevisiae, the resolvases, including Gin
recombinase of phage Mu, Cin, Hin, αδ Tn3; the Pin recombinase of E. coli,
the R/RS system of the pSR1 plasmid of Zygosaccharomyces rouxii, site
specific recombinases from Kluyveromyces drosophilarium and
Kluyveromyces waltii and other systems.

Also contemplated is the E. coli phage lambda integrase system, the phage
lambda integrase and the cognate att sites (see, also copending application
U.S. application Serial No. (attorney docket No. 24601-420, filed on the
same day herewith)).

In any of these methods of producing acrocentric plant chromosomes, 
nucleic acid containing a site-specific recombination site can also contain 
nucleic acid encoding a selectable marker. The nucleic acids used in the 
methods can be designed such that expression of the selectable marker
occurs only upon the desired recombination event.

Acrocentric plant chromosomes produced by the methods provided
herein can be of any composition. For example, the DNA of the short arm of
the acrocentric chromosome can contain less than 5% or less than 1%
euchromatic DNA or can contain no euchromatic DNA. Acrocentric plant
artificial chromosomes in which the short arm of the acrocentric chromosome
does not contain euchromatic DNA are provided. 

In another embodiment, a method of producing a plant artificial
chromosome is provided that includes the steps of introducing nucleic acid
into a plant cell acrocentric chromosome in which the short arm does not
contain euchromatic DNA; culturing the cell through at least one cell
division; and selecting a cell containing an artificial chromosome, such as
one that is predominantly heterochromatic. The acrocentric chromosome is
produced by any of the methods described herein or other
suitable methods.

In another embodiment, a method for producing an artificial
chromosome is provided that includes the steps of introducing nucleic acid
into a plant cell; and selecting a plant cell that includes an artificial
chromosome that contains one or more repeat regions. In this AC, one or
more nucleic acid units is (are) repeated in a repeat region; repeats of a
nucleic acid unit have common nucleic acid sequences; and the common
sequences of nucleotides include sequences that represent euchromatic and
heterochromatic nucleic acid. The nucleic acid can include plant rDNA from
a dicot plant species or plant rDNA from a monocot plant species. The
intergenic spacer region can be from DNA from a Nicotiana plant or other
suitable source of such DNA. The rDNA can be plant rDNA, and the plant
can be a dicot or a monocot.

Also provided are isolated plant artificial chromosomes that contain 
one or more repeat regions. In these ACs one or more nucleic acid units is
(are) repeated in a repeat region; repeats of a nucleic acid unit have common
nucleic acid sequences; and the common sequences of nucleotides include
sequences that represent euchromatic and heterochromatic nucleic acid. The
artificial chromosome can be produced by a method that includes the steps
of: introducing nucleic acid into a plant cell; and selecting a plant cell
containing an artificial chromosome that contains one or more repeat regions.
The repeats of a nucleic acid unit have common nucleic acid sequences; and 
the common nucleic acid sequences contain sequences that represent 
euchromatic and heterochromatic nucleic acid. 

In another embodiment, another method for producing an acrocentric
plant chromosome is provided. The method includes the steps of:
introducing nucleic acid containing two site-specific recombination sites into
a cell containing one or more plant chromosomes; and introducing into the
cell a recombinase activity that catalyzes recombination between the two
recombination sites to produce a plant acrocentric chromosome. In this
embodiment, the two site-specific recombination sites can be on separate
nucleic acid fragments, which optionally can be introduced into the cell
simultaneously or sequentially. The resulting artificial chromosome can be
one that is predominantly heterochromatic.

In another embodiment, a method of producing a plant artificial
chromosome is provided. The method includes the steps of: introducing
nucleic acid into a plant chromosome, such as but not limited to, an
acrocentric chromosome, in a cell that contains adjacent regions of rDNA and
heterochromatic DNA; culturing the cell through at least one cell division;
and selecting a cell containing an artificial chromosome. The resulting
artificial chromosome can be predominantly heterochromatic. The
acrocentric chromosome can be one where the short arm of the chromosome
contains adjacent regions of rDNA and heterochromatic DNA, such as, but
not limited to, pericentric heterochromatin.

Also provided are a variety of vectors. Among these are vectors
containing nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal cells;
and wherein the agent is not toxic to plant cells; a recognition site for
recombination; and a sequence of nucleotides that facilitates amplification of
a region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome. Exemplary of such vectors are pAgIIa and pAgIIb.

Another vector provided herein contains nucleic acid encoding a
selectable marker that is not operably associated with any promoter, wherein
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells; and wherein the agent is not toxic to
plant cells; a recognition site for recombination; and nucleic acid encoding a
protein operably linked to a plant promoter. Exemplary of these vectors are
pAg1 and pAg2.

Another vector that is provided contains: nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of plant cells in the presence of an
agent normally toxic to the plant cells but not toxic to animal cells; a
recognition site for recombination; and nucleic acid encoding a protein
operably linked to a plant promoter.

Another vector is a plant transformation vector that contains nucleic
acid encoding a recognition site for recombination; a sequence of nucleotides
that facilitates or causes amplification of a region of a plant chromosome;
one or more selectable markers that are expressed in plant cells to permit the
selection of cells containing the vector, and Agrobacterium nucleic acid. The
vector is for Agrobacterium-mediated transformation of plants.

Another vector that is provided contains a recognition site for
recombination; and a sequence of nucleotides that facilitates amplification of
a region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome, wherein the plant is selected from the group
consisting of Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus,
Hordeum, Zea mays, Brassica, Triticum, Helianthus, soybean, cotton and
Oryza.

In these vectors, the amplifiable region can contain heterochromatic
nucleic acid; the amplifiable region can contain rDNA. Exemplary sequences
of nucleotides that facilitate amplification of a region of a plant chromosome
or target the vector to an amplifiable region of a plant chromosome are any
that contain a sufficient portion of an intergenic spacer region of rDNA to
facilitate amplification or effect the targeting. Such sufficient portion can be
at least 14, 20, 30, 50, 100, 150, 300, 500, 1 kB, 2 kB, 3 kB, 5 kB, 10 kB
or more contiguous nucleotides from an intergenic spacer region and/or other
rDNA region. An exemplary selectable marker encodes a product that confers
resistance to zeomycin. The proteins in the vectors include a protein that is a
selectable marker that permits growth of plant cells in the presence of an
agent normally toxic to the plant cells, such as, for example, a marker that
confers resistance to hygromycin or to phosphinothricin. Other such protein
markers include, but are not limited to, fluorescent proteins, such as, for
example, green, blue and red fluorescent proteins. An exemplary recognition
site contains an att site. Exemplary promoters for inclusion in the vectors
include, but are not limited to, nopaline synthase (NOS) or CaMV35S.
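
By way of illustration only, the "sufficient portion" criterion above (a minimum number of contiguous nucleotides taken from an intergenic spacer region) can be sketched computationally. The function names, the default threshold of 14 and the exact-match comparison are assumptions made for this sketch; the sequences passed in would be supplied by the user, and no particular rDNA sequence is implied.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    candidate, reference = candidate.upper(), reference.upper()
    best = 0
    # prev[j] holds the length of the shared run ending at
    # candidate[i-2] and reference[j-1] (the previous row of the table).
    prev = [0] * (len(reference) + 1)
    for i in range(1, len(candidate) + 1):
        curr = [0] * (len(reference) + 1)
        for j in range(1, len(reference) + 1):
            if candidate[i - 1] == reference[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def has_sufficient_portion(candidate: str, reference: str,
                           min_contiguous: int = 14) -> bool:
    """Hypothetical check: does the candidate carry at least
    `min_contiguous` contiguous nucleotides of the reference spacer?"""
    return longest_shared_run(candidate, reference) >= min_contiguous
```

The threshold can be raised to any of the other listed lengths (20, 30, 50 and so on) through `min_contiguous`. Exact matching is a simplification; whether a given portion actually facilitates amplification or targeting is an empirical question that this sketch does not address.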

Cells containing any of the vectors or mixtures thereof are provided.
The cells include any cells that have at least one plant chromosome, such as
a plant cell. The cells can be protoplasts.

Methods using these vectors are provided. The methods include a
step of introducing one of the vectors into a cell, such as a cell that
contains at least one plant chromosome. Such vector is, for example, a
vector that contains nucleic acid encoding a selectable marker that is not
operably associated with any promoter, where the selectable marker permits
growth of animal cells in the presence of an agent normally toxic to the
animal cells but is not toxic to plant cells; a recognition site for
recombination; and
nucleic acid encoding a protein operably linked to a plant promoter. In this
method, the cell contains an animal, such as a mammalian, platform ACes that
contains a recognition site, such as, for example, an att site, that recombines
with the recognition site in the vector in the presence of the recombinase
therefor, thereby incorporating the selectable marker that is not operably
associated with any promoter and the nucleic acid encoding a protein
operably linked to a plant promoter into the platform ACes to produce a
resulting platform ACes. The platform ACes can contain a promoter that,
upon recombination, is operably linked to the selectable marker that in the
vector is not operably associated with a promoter. The method can further
include transferring the resulting platform ACes into a plant cell to produce a
plant cell that contains the platform ACes. The method optionally further
includes culturing the plant cell that contains the platform ACes under
conditions whereby the protein encoded by the nucleic acid that is operably
linked to a plant promoter is expressed.
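
The logic of the method above (a promoterless marker in the vector becomes expressed only after recombination places it next to a promoter resident on the platform) can be sketched as a toy data model. This is an illustrative abstraction only: the class names, the `COGNATE` site pairing (including the "attB" partner site) and the label `platform_promoter` are assumptions introduced for the sketch and do not describe any actual construct.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cassette:
    gene: str
    promoter: Optional[str] = None  # None models a promoterless sequence

    @property
    def expressed(self) -> bool:
        # In this toy model a gene is expressed only if a promoter drives it.
        return self.promoter is not None


@dataclass
class DonorVector:
    site: str          # recognition site carried by the vector
    marker: Cassette   # promoterless selectable marker
    payload: Cassette  # protein-encoding nucleic acid with a plant promoter


@dataclass
class Platform:
    site: str               # recognition site on the platform ACes
    resident_promoter: str  # promoter positioned next to that site
    cassettes: List[Cassette] = field(default_factory=list)


# Hypothetical pairing of platform and vector sites that recombine.
COGNATE = {("attP", "attB")}


def integrate(platform: Platform, donor: DonorVector,
              recombinase_present: bool) -> bool:
    """Model site-specific integration; return True if it occurred."""
    if not recombinase_present:
        return False
    if (platform.site, donor.site) not in COGNATE:
        return False
    # The marker comes under control of the platform's resident promoter.
    platform.cassettes.append(
        Cassette(donor.marker.gene, platform.resident_promoter))
    # The payload keeps its own plant promoter.
    platform.cassettes.append(donor.payload)
    return True
```

In this model, selecting for the marker identifies only those cells in which the integration step returned `True`, which mirrors the selection rationale described in the text.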

The resulting platform ACes optionally is isolated prior to transfer.
The ACes can be introduced into a plant cell by any suitable method, such as
one selected from among protoplast transfection, lipid-mediated delivery,
liposomes, electroporation, sonoporation, microinjection, particle
bombardment, silicon carbide whisker-mediated transformation, polyethylene
glycol (PEG)-mediated DNA uptake, lipofection and lipid-mediated carrier
systems. The resulting platform ACes can be transferred by fusion of the
cells, which, for example, are plant protoplasts. In another embodiment, the
cell can be an animal cell, such as a mammalian, including human, cell.

In another method, a vector is introduced into plant cells. Such
vector, for example, can be a vector that includes nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells but is not toxic to plant cells; a
recognition site for recombination; and a sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome. The plant cells are
cultured and a plant cell(s) containing an artificial chromosome that contains
one or more repeat regions is selected. In this method, a sufficient portion of
the vector can integrate into a chromosome in the plant cell to result in
amplification of chromosomal DNA. The resulting selected artificial
chromosome can be one in which one or more nucleic acid units is (are)
repeated in a repeat region; repeats of a nucleic acid unit have common
nucleic acid sequences; and the repeat region(s) contain substantially
equivalent amounts of euchromatic and heterochromatic nucleic acid. The
resulting artificial chromosome produced in the method optionally can be
isolated.

Another method is also provided. This method includes the steps of
introducing a vector into a cell, and culturing the resulting cell under
conditions whereby the protein encoded by nucleic acid operably linked to
an animal promoter is expressed. In the method the vector can contain:
nucleic acid encoding a selectable marker that is not operably associated
with any promoter, where the selectable marker permits growth of animal
cells in the presence of an agent normally toxic to the animal cells but is not
toxic to plant cells; a recognition site for recombination; and nucleic acid
encoding a protein operably linked to an animal promoter. The cell can
contain a platform plant artificial chromosome (PAC) that contains a
recombination site and an animal promoter that upon recombination is
operably linked to the selectable marker that in the vector is not operably
associated with a promoter. Introduction can be effected under conditions
whereby the vector recombines with the PAC to produce a plant platform
PAC that contains the selectable marker operably linked to the promoter. In
this method, the artificial chromosome can be an ACes. In addition, the
plant platform PAC can be an ACes.

The vectors, such as those that contain nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells but is not toxic to plant cells; a
recognition site for recombination; and a sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome, and the plant
transformation vectors that contain nucleic acid for Agrobacterium-mediated
transformation of plants, can be used to produce artificial chromosomes. In
one exemplary method, such vector is introduced into a cell containing one
or more plant chromosomes; and
a cell containing an artificial chromosome that contains one or more repeat
regions is selected. The artificial chromosome contains one or more nucleic
acid units that is (are) repeated in a repeat region; the repeats of a nucleic
acid unit have common nucleic acid sequences; and the common nucleic acid
sequences contain sequences that represent euchromatic and
heterochromatic nucleic acid. In another method, a cell containing an
artificial chromosome that contains one or more repeat regions is selected.
The artificial chromosome contains one or more nucleic acid units that is (are)
repeated in a repeat region; repeats of a nucleic acid unit have common
nucleic acid sequences; and
the repeat region(s) contain substantially equivalent amounts of euchromatic
and heterochromatic nucleic acid.

DESCRIPTION OF THE DRAWINGS

Figure 1 provides a map of plasmid pAg1.

Figure 2 provides a schematic representation of the construction of
plasmid pAg1.

Figure 3 provides a map of plasmid pAg2.

Figure 4 provides a schematic representation of the construction of
plasmid pAg2.

Figure 5 provides a schematic representation of the construction of
plasmids pAgIIa and pAgIIb.

Figures 6A-6B provide restriction maps of the DNA inserted into pAg1
to form plasmids pAgIIa and pAgIIb.

Figure 7 provides a map of plasmid pSV40193attPsensePUR.

Figure 8 depicts a method for formation of a chromosome platform
with multiple recombination integration sites, such as attP sites.

Figure 9 diagrammatically summarizes the platform technology;
marker 1 permits selection of the artificial chromosomes containing the
integration site; marker 2, which is promoterless in the donor vector, permits
selection of recombinants. Upon recombination with the platform, marker 2
is expressed under the control of a promoter resident on the platform.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Definitions

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as is commonly understood by one of skill in the art
to which this invention belongs. All patents, patent applications, published
applications and other publications and published nucleotide and amino acid
sequences (e.g., sequences available in GenBank or other databases) referred
to herein are incorporated by reference in their entirety. Where reference is
made to a URL or other such identifier or address, it is understood that such
identifiers can change and particular information on the internet can come
and go, but equivalent information can be found by searching the internet.
Reference thereto evidences the availability and public dissemination of such
information.

As used herein, a chromosome is a defined composition of nucleic
acid that is capable of replication and segregation within a cell upon cell
division. Typically, a chromosome may contain a centromeric region,
telomeric regions and a region of nucleic acid between the centromeric and
telomeric regions.

As used herein, a centromere is a molecular composition that includes
a nucleic acid sequence that confers an ability to segregate to daughter cells
through cell division. A centromere may confer stable segregation of a
nucleic acid sequence, including an artificial chromosome containing the
centromere, through mitotic and/or meiotic divisions. A plant centromere is
not necessarily derived from plants, but has the ability to promote DNA
segregation in plant cells.

As used herein, euchromatin and heterochromatin have their
recognized meanings. Euchromatin refers to chromatin that stains diffusely
and that typically contains genes, and heterochromatin refers to chromatin
that remains unusually condensed and that has been thought to be
transcriptionally inactive or has low transcriptional activity relative to
euchromatin. Highly repetitive DNA sequences (satellite DNA) are usually
located in regions of the heterochromatin surrounding the centromere
(pericentric or pericentromeric heterochromatin). Constitutive
heterochromatin refers to heterochromatin that contains the highly repetitive
DNA which is constitutively condensed and genetically inactive.

As used herein, an acrocentric chromosome refers to a chromosome 
with arms of unequal length. 

As used herein, endogenous chromosomes refer to genomic chromosomes
as found in the cell prior to generation or introduction of an artificial
chromosome.

As used herein, artificial chromosomes are nucleic acid molecules,
typically DNA, that stably replicate and segregate alongside endogenous
chromosomes in cells and have the capacity to accommodate and express
heterologous genes contained therein. A mammalian artificial chromosome
(MAC) refers to a chromosome that has an active mammalian centromere(s).
Plant artificial chromosomes (PAC), insect artificial chromosomes and avian
artificial chromosomes refer to chromosomes that include centromeres that
function in plant, insect and avian cells, respectively. Human artificial
chromosomes (HAC) refer to chromosomes that include centromeres that
function in human cells. For exemplary artificial chromosomes, see, e.g.,
U.S. Patent Nos. 6,025,155; 6,077,697; 5,288,625; 5,712,134;
5,695,967; 5,869,294; 5,891,691 and 5,721,118 and published
International PCT application Nos. WO 97/40183 and WO 98/08964.

As used herein, amplification, with reference to DNA, is a process in 
which segments of DNA are duplicated to yield two or multiple copies of 
substantially similar or identical or nearly identical DNA segments that are 
typically joined as substantially tandem or successive repeats or inverted 
repeats.
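
The distinction drawn above between tandem (successive) repeats and inverted repeats can be illustrated on a short sequence string. The following is a minimal sketch for orientation only; the function names and the short example unit are hypothetical, and real amplified segments are many orders of magnitude longer and need only be substantially similar rather than identical.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tandem_repeat(unit: str, copies: int) -> str:
    """Head-to-tail copies of the same unit, all in one orientation."""
    return unit.upper() * copies


def inverted_repeat(unit: str) -> str:
    """The unit followed by a copy of itself in the opposite orientation."""
    return unit.upper() + reverse_complement(unit)
```

For a unit `AAGC`, two tandem copies read `AAGCAAGC`, whereas the inverted arrangement reads `AAGCGCTT`.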

As used herein, amplification-based artificial chromosomes are
artificial chromosomes derived from natural or endogenous chromosomes by
virtue of an amplification event, such as one that may be initiated by
introduction of heterologous nucleic acid into heterochromatin, for example,
pericentric heterochromatin, in a chromosome. As a result of such an event,
chromosomes and/or fragments thereof exhibiting segmented or repeating
patterns arise. Artificial chromosomes can be formed from these
chromosomes and fragments. Hence, amplification-based artificial
chromosomes refer to non-natural or isolated chromosomes that exhibit an
ordered segmentation that is not typically observed in naturally occurring
chromosomes and that can be a basis for distinguishing them from naturally
occurring chromosomes. Amplification-based artificial chromosomes can
also be distinguished from naturally occurring chromosomes by virtue of their
typically smaller size and often segmented appearance when visualized. The
segmented appearance, which can be visualized using a variety of
chromosome analysis techniques as described herein and known to those of
skill in the art, correlates with the unique structure of these artificial
chromosomes. In addition to containing one or more centromeres, the
amplification-based artificial chromosomes, throughout the region or regions
of segmentation, are predominantly made up of one or more nucleic acid
units, also referred to as "amplicons", that is (are) repeated in the region and
that have a similar gross structure. Thus, a region of segmentation may be
referred to as a repeat region. Repeats of an amplicon tend to be of similar
size and share some common nucleic acid sequences. For example, each
repeat of an amplicon may contain a replication site involved in amplification
of chromosome segments and/or some heterologous nucleic acid that was
utilized in the initial production of the artificial chromosome. Typically, the
repeating units are substantially similar in nucleic acid composition and may
be nearly identical. The common nucleic acid sequences may contain
sequences that represent euchromatic and heterochromatic nucleic acid.
Amplicon sizes vary but typically tend to be greater than about 100 kb,
greater than about 500 kb, greater than about 1 Mb, greater than about 5
Mb or greater than about 10 Mb. The composition of the amplification-based
artificial chromosomes may be such that substantially the entire chromosome
exhibits a segmented appearance or such that only one or more portions that
make up less than the entire chromosome appear segmented. The
amplification-based artificial chromosomes can also differ depending on the
chromosomal region that has undergone amplification in the process of
artificial chromosome formation. The structures of the resulting
chromosomes can vary depending upon the initiating event and/or the
conditions under which the heterologous nucleic acid is introduced, including
modification to the endogenous chromosomes. For example, in some of the
artificial chromosomes provided herein, the region or regions of segmentation
may be made up predominantly of heterochromatic DNA. In other artificial
chromosomes provided herein, the region or regions of segmentation may be
made up predominantly of euchromatic DNA or may be made up of similar
amounts of heterochromatic and euchromatic DNA. The region or regions of
segmentation thus may be entirely heterochromatic (while still containing one
or more heterologous nucleic acid sequences), or may contain increasing
amounts of euchromatic DNA, such that, for example, the region contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
90% euchromatic DNA. Because the entire artificial chromosome can be
made up predominantly of a region or regions of segmentation, it is thus
possible for the artificial chromosome to be made up predominantly of
heterochromatin or euchromatin, or to be made up of substantially equivalent
amounts of heterochromatin and euchromatin, e.g., about 40% to about
50% of one type of nucleic acid and about 50% to about 60% of the other
type of nucleic acid.

As used herein the term "predominantly" with respect to a
composition generally refers to a state of the composition in which it can be
characterized as being or having more of the predominant feature than other
features which are not predominant. The predominant feature may represent
more than about 50%, more than about 60%, more than about 70%, more
than about 80%, more than about 90%, more than about 95% or essentially
100% of the composition. Thus, for example, a repeat region that is
predominantly made up of heterochromatic DNA contains more
heterochromatic DNA than other types, e.g., euchromatic, of DNA. The
repeat region may be more than about 50%, more than about 60%, more
than about 70%, more than about 80%, more than about 90% or more than
about 95% heterochromatic DNA or may be essentially 100%
heterochromatic DNA. An artificial chromosome predominantly made up of
heterochromatin contains more heterochromatic DNA than other types, e.g.,
euchromatic, of DNA and may be more than about 50%, more than about
60%, more than about 70%, more than about 80%, more than about 90%
or more than about 95% heterochromatic DNA or may be essentially 100%
heterochromatic DNA.
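
As a purely illustrative aid, the graded meaning of "predominantly" can be expressed as a simple threshold test on the fraction of a composition represented by the feature of interest. The names below are hypothetical, and the strict numeric cutoffs are an assumption of this sketch: the definition above uses "about", which a hard comparison cannot capture.

```python
# Thresholds recited in the definition, expressed as fractions.
TIERS = (0.50, 0.60, 0.70, 0.80, 0.90, 0.95)


def is_predominantly(feature_fraction: float) -> bool:
    """True if the feature makes up more than half of the composition."""
    return feature_fraction > 0.50


def highest_tier(feature_fraction: float):
    """Return the largest recited threshold that is exceeded, or None."""
    met = [tier for tier in TIERS if feature_fraction > tier]
    return max(met) if met else None
```

Under these assumptions, a repeat region measured at 92% heterochromatic DNA would satisfy the "more than about 90%" recitation but not the "more than about 95%" recitation.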

As used herein an amplicon is a repeated nucleic acid unit. In some of
the artificial chromosomes described herein, an amplicon may contain a set
of inverted repeats of a megareplicon. A megareplicon represents a higher
order replication unit. For example, with reference to some of the
predominantly heterochromatic artificial chromosomes, particularly eukaryotic
chromosomes, described herein, the megareplicon may contain a set of
tandem DNA blocks (e.g., ~7.5 Mb DNA blocks) each containing satellite
DNA flanked by non-satellite DNA or may substantially be made up of rDNA.
Contained within the megareplicon is a primary replication site, referred to as
the megareplicator, which may be involved in organizing and facilitating
replication of segments of chromosomes, including, for example,
heterochromatin, pericentric heterochromatin, rDNA and/or possibly the
centromeres. Within the megareplicon there may be smaller (e.g., 50-300
kb) secondary replicons.

As used herein, amplifiable, when used in
reference to a chromosome, particularly the method of generating artificial
chromosomes provided herein, refers to a region of a chromosome that is
prone to amplification. Amplification typically occurs during replication and
other cellular events involving recombination (e.g., DNA repair). Included
among such regions are regions of the chromosome that contain tandem
repeats, such as satellite DNA, rDNA, and other such sequences.

Among the artificial chromosome systems provided herein are those
that are predominantly heterochromatic [formerly referred to as satellite
artificial chromosomes (SATACs); see, e.g., U.S. Patent Nos. 6,077,697
and 6,025,155 and published International PCT application No.
WO 97/40183], minichromosomes which contain a de novo centromere,
artificial chromosomes containing one or more regions of repeating nucleic
acid units wherein the repeat region(s) contain substantially equivalent
amounts of euchromatic and heterochromatic nucleic acid and in vitro
assembled artificial chromosomes. Of particular interest herein are artificial
chromosomes that introduce and express heterologous nucleic acids in
plants. These include artificial chromosomes that have a centromere derived
from a plant, and, also, artificial chromosomes that have centromeres that
may be derived from other organisms but that function in plants. Methods
for the construction, isolation, and delivery to target cells of each type of
artificial chromosome are provided herein.

As used herein, to target nucleic acid to a locus on a chromosome 
means that the nucleic acid integrates at or near the targeted locus. Any 
method or means for effecting such integration, including, but not limited to,
homologous recombination, is contemplated. 

As used herein, a dicentric chromosome is a chromosome that 
contains two centromeres. A multicentric chromosome contains more than 
two centromeres. 

As used herein, a formerly dicentric chromosome is a chromosome
that is produced when a dicentric chromosome fragments and acquires new
telomeres so that two chromosomes, each having one of the centromeres,
are produced. Each of the fragments is a replicable chromosome. If one of
the chromosomes undergoes amplification of primarily euchromatic DNA to
produce a fully functional chromosome that is predominantly (more than
about 50%, more than about 70% or more than about 90% euchromatin)
euchromatin, it is a minichromosome. The remaining chromosome is a
formerly dicentric chromosome. If one of the chromosomes undergoes
amplification, whereby heterochromatin (such as, for example, satellite DNA)
is amplified and a euchromatic portion (such as, for example, an arm)
remains, it is referred to as a sausage chromosome. A chromosome that is
substantially all heterochromatin, except for portions of heterologous DNA, is
called a predominantly heterochromatic artificial chromosome. Predominantly
heterochromatic artificial chromosomes can be produced from other partially
heterochromatic artificial chromosomes by culturing the cell containing such
chromosomes under conditions that destabilize the chromosome and/or under
selective conditions so that a predominantly heterochromatic artificial
chromosome is produced. For purposes herein, it is understood that the
artificial chromosomes may not necessarily be produced in multiple steps,
but may appear after the initial introduction of the heterologous DNA.
Typically, artificial chromosomes appear after about 5 to about 60, or about
5 to about 55, or about 10 to about 55 or about 25 to about 55 or about 35
to about 55 cell divisions following introduction of nucleic acid into a cell.
Artificial chromosomes may, however, appear after only about 5 to about 15
or about 10 to about 15 cell divisions.

As used herein, the term "satellite DNA-based artificial chromosome
(SATAC)" is interchangeable with the term "artificial chromosome expression
system (ACes)". These artificial chromosomes (ACes) include those that are
substantially all neutral non-coding sequences (heterochromatin) except for
foreign heterologous, typically gene or protein-encoding, nucleic acid, that
may be interspersed within the heterochromatin for the expression therein
(see U.S. Patent Nos. 6,025,155 and 6,077,697 and International PCT
application No. WO 97/40183), or that is in a single locus as provided
herein. The delineating structural feature is the presence of repeating units,
which are generally predominantly heterochromatin. The precise structure of
the ACes will depend upon the structure of the chromosome in which the
initial amplification event occurs; all share the common feature of including a
defined pattern of repeating units. Generally ACes have more
heterochromatin than euchromatin. Foreign nucleic acid molecules
(heterologous genes) contained in these artificial chromosome expression
systems can include any nucleic acid whose expression is of interest in a
particular host cell.

As used herein, an artificial chromosome that is predominantly
heterochromatic (i.e., containing more heterochromatin than euchromatin,
typically more than about 50%, more than about 60%, more than about
70%, more than about 80% or more than about 90% heterochromatin) may
be produced by introducing nucleic acid molecules into cells, particularly
plant cells, and selecting cells that contain a predominantly heterochromatic
artificial chromosome. Any nucleic acid may be introduced into cells in the
methods of producing the artificial chromosomes. For example, the nucleic
acid may contain a selectable marker and/or a sequence that targets nucleic
acid to a heterochromatic region of a chromosome, particularly a plant
chromosome, such as in the pericentric heterochromatin, in the short arm of
acrocentric chromosomes, rDNA or nucleolar organizing regions. Targeting
sequences include, but are not limited to, lambda phage DNA and rDNA
(e.g., a sequence of an intergenic spacer of rDNA), particularly plant rDNA,
for production of predominantly heterochromatic artificial chromosomes in
plant cells.

After introducing the nucleic acid into cells, a cell containing a
predominantly heterochromatic artificial chromosome is selected. Such cells
may be identified using a variety of procedures. For example, repeating units
of heterochromatic DNA of these chromosomes may be discerned by G-
and/or C-banding and/or fluorescence in situ hybridization (FISH) techniques.
Prior to such analyses, the cells to be analyzed may be enriched with
artificial chromosome-containing cells by sorting the cells on the basis of the
presence of a selectable marker, such as a reporter protein, or by growing
(culturing) the cells under selective conditions. Selection of cells containing
amplified nucleic acids may also be facilitated by use of techniques such as
PCR and Southern blotting to identify cell lines with amplified regions. It is
also possible, after introduction of nucleic acids into cells, to select cells that
have a multicentric, typically dicentric, chromosome, a formerly multicentric
(typically dicentric) chromosome and/or various heterochromatic structures
and to treat them such that desired artificial chromosomes are produced.
Conditions for generation of a desired structure include, but are not limited
to, further growth under selective conditions, introduction of additional
nucleic acid molecules and/or growth under selective conditions and
treatment with destabilizing agents, and other such methods (see
International PCT application No. WO 97/40183 and U.S. Patent Nos.
6,025,155 and 6,077,697).

As used herein, heterologous and foreign are used interchangeably
with respect to nucleic acid and refer to any nucleic acid, including DNA and
RNA, that does not occur naturally as part of the genome in which it is
present or which is found in a location or locations in the genome that differ
from that in which it occurs in nature. Thus, heterologous or foreign nucleic
acid is nucleic acid that is not normally found in the host genome in an
identical context. It is nucleic acid that is not endogenous to the cell and has
been exogenously introduced into the cell. Examples of heterologous DNA
include, but are not limited to, DNA that encodes a gene product or gene
product(s) of interest, introduced for purposes of modification of the
endogenous genes or for production of an encoded protein. For example, a
heterologous or foreign gene may be isolated from a different species than
that of the host genome, or alternatively, may be isolated from the host
genome but operably linked to one or more regulatory regions which differ
from those found in the unaltered, native gene. Other examples of
heterologous DNA include, but are not limited to, DNA that encodes
traceable marker proteins, and DNA that encodes a protein that confers an
input trait including, but not limited to, herbicide, insect, or disease
resistance or an output trait, including, but not limited to, oil quality or
carbohydrate composition. Antibodies that are encoded by heterologous
DNA may be secreted, sequestered, stored in an organ or tissue,
accumulated in the cytoplasm or cellular organelles or expressed on the
surface of the cell in which the heterologous DNA has been introduced.

As used herein, a "selectable marker" is a composition that can be
used to distinguish one cell from another cell. For example, a selectable
marker may be a nucleic acid encoding a readily detected protein that has
been introduced into some cells but not others. Detection of the expressed
protein in cells facilitates identification of cells containing the marker nucleic
acid by distinguishing them from cells that do not contain the nucleic acid.
Thus, for example, a selectable marker may be a fluorescent protein, such as
green fluorescent protein (GFP), or β-galactosidase (or a nucleic acid
encoding either of these proteins). Selectable markers such as these, which
are not required for cell survival and/or proliferation in the presence of a
selection agent, may also be referred to as reporter molecules. Other
selectable markers, e.g., the neomycin phosphotransferase gene, provide for
isolation and identification of cells containing them by conferring properties
on the cells that make them resistant to an agent, e.g., a drug such as an
antibiotic, that inhibits proliferation of cells that do not contain the marker.

As used herein, growth under selective conditions means growth of a
cell under conditions that require expression of a selectable marker for
survival.

As used herein, an agent that destabilizes a chromosome is any agent
known by those of skill in the art to enhance amplification events and/or
mutations. Such agents, which include BrdU, are well known to those of
skill in the art.

In order to generate an artificial chromosome containing a particular
heterologous nucleic acid of interest, it is possible to include the nucleic acid
of interest in the nucleic acid that is being introduced into cells to initiate
production of the artificial chromosome. Thus, for example, a nucleic acid of
interest could be introduced into a cell along with nucleic acid encoding a
selectable marker and/or a nucleic acid that targets to a heterochromatic
region of a chromosome. For example, the nucleic acid of interest can be
linked to targeting nucleic acid(s). Alternatively, heterologous nucleic acid of
interest can be introduced into an artificial chromosome at a later time after
the initial generation of the artificial chromosome.

As used herein, the minichromosome refers to a chromosome derived
from a multicentric, typically dicentric, chromosome that contains more
euchromatic than heterochromatic DNA. For purposes herein, the
minichromosome contains a de novo centromere, preferably a centromere
that replicates in plants, more preferably a plant centromere.



WO 2002/096923 



PCT/US2002/017451 




As used herein, de novo with reference to a centromere, refers to 
generation of an excess centromere in a chromosome as a result of 
incorporation of a heterologous nucleic acid fragment using the methods 
herein. 

As used herein, in vitro assembled artificial chromosomes or synthetic
chromosomes are artificial chromosomes produced by joining essential
components of a chromosome in vitro. These components include at least a
centromere, a telomere and an origin of replication. An in vitro assembled
artificial chromosome may include one or more megareplicators. In particular
embodiments, the megareplicator contains sequences of rDNA, particularly
plant rDNA.

As used herein, in vitro assembled plant artificial chromosomes are
produced by joining components (e.g., the centromere, telomere(s),
megareplicator and an origin of replication) that function in plants, and
preferably, one or more of which is derived from a plant. In vitro assembled
artificial chromosomes may contain any amount of heterochromatic and/or
euchromatic nucleic acid. For example, an in vitro assembled artificial
chromosome may be substantially all heterochromatin, or may contain
increasing amounts of euchromatic DNA, such that, for example, it contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
about 90% euchromatic DNA. In vitro assembled artificial chromosomes
may contain one or more regions of segmentation as described with
reference to amplification-based artificial chromosomes.

As used herein, an artificial chromosome platform refers to an artificial
chromosome that has been engineered to include one or more sites for
site-specific recombination-directed integration. Included within the artificial
chromosome platforms are ACes, particularly plant ACes, that are
so-engineered. Any sites, including but not limited to any described herein,
that are suitable for such integration are contemplated. Among the ACes
contemplated herein are those that are predominantly heterochromatic
(formerly referred to as satellite artificial chromosomes (SATACs); see, e.g.,
U.S. Patent Nos. 6,077,697 and 6,025,155 and published International PCT
application No. WO 97/40183), artificial chromosomes predominantly made
up of repeating nucleic acid units and that contain substantially equivalent
amounts of euchromatic and heterochromatic DNA or wherein the repeat
regions of the chromosomes contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid. Included among the ACes for
use in generating platforms are artificial chromosomes that introduce and
express heterologous nucleic acids in plants as described herein. These
include artificial chromosomes that have a centromere derived from a plant,
and, also, artificial chromosomes that have centromeres that may be derived
from other organisms but that function in plants.

As used herein, recognition sequences are particular sequences of
nucleotides that a protein, DNA, or RNA molecule, or combinations thereof
(such as, but not limited to, a restriction endonuclease, a modification
methylase and a recombinase), recognizes and binds. For example, a
recognition sequence for Cre recombinase (see, e.g., SEQ ID No. 30) is a 34
base pair sequence containing two 13 base pair inverted repeats (serving as
the recombinase binding sites) flanking an 8 base pair core and designated
loxP (see, e.g., Sauer (1994) Current Opinion in Biotechnology 5:521-527).
Other examples of recognition sequences include, but are not limited to,
attB and attP, attR and attL and others (see, e.g., SEQ ID Nos. 32-48), that
are recognized by the recombinase enzyme Integrase (see SEQ ID Nos. 49
and 50 for the nucleotide and encoded amino acid sequences of an
exemplary lambda phage integrase).

The recombination site designated attB is an approximately 33 base
pair sequence containing two 9 base pair core-type Int binding sites and a 7
base pair overlap region; attP (SEQ ID No. 48) is an approximately 240 base
pair sequence containing core-type Int binding sites and arm-type Int binding
sites as well as sites for auxiliary proteins IHF, FIS, and Xis (see, e.g., Landy
(1993) Current Opinion in Biotechnology 3:699-707; see, e.g., SEQ ID Nos.
32 and 48).

As used herein, a recombinase is an enzyme that catalyzes the
exchange of DNA segments at specific recombination sites. An integrase
herein refers to a recombinase that is a member of the lambda (λ) integrase
family.

As used herein, recombination proteins include excisive proteins,
integrative proteins, enzymes, co-factors and associated proteins that are
involved in recombination reactions using one or more recombination sites
(see, Landy (1993) Current Opinion in Biotechnology 3:699-707).

As used herein, the expression "lox site" means a sequence of
nucleotides at which the gene product of the cre gene, referred to
herein as Cre, can catalyze a site-specific recombination event. A LoxP site
is a 34 base pair nucleotide sequence from bacteriophage P1 (see, e.g.,
Hoess et al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398-3402). The LoxP
site contains two 13 base pair inverted repeats separated by an 8 base pair
spacer region as follows (SEQ ID NO. 51):

ATAACTTCGTATA ATGTATGC TATACGAAGTTAT

E. coli DH5 Δlac and yeast strain BSY23 transformed with plasmid pBS44
carrying two loxP sites connected with a LEU2 gene are available from the
American Type Culture Collection (ATCC) under accession numbers ATCC
53254 and ATCC 20773, respectively. The lox sites can be isolated from
plasmid pBS44 with restriction enzymes EcoRI and SalI, or XhoI and BamHI.
In addition, a preselected DNA segment can be inserted into pBS44 at either
the SalI or BamHI restriction enzyme sites. Other lox sites include, but are
not limited to, LoxB, LoxL, LoxC2 and LoxR sites, which are nucleotide
sequences isolated from E. coli (see, e.g., Hoess et al. (1982) Proc. Natl.
Acad. Sci. U.S.A. 79:3398). Lox sites can also be produced by a variety of
synthetic techniques (see, e.g., Ito et al. (1982) Nuc. Acid Res. 10:1755 and
Ogilvie et al. (1981) Science 214:270).

As used herein, the expression "cre gene" means a sequence of
nucleotides that encodes a gene product that effects site-specific
recombination of DNA in eukaryotic cells at lox sites. One cre gene can be
isolated from bacteriophage P1 (see, e.g., Abremski et al. (1983) Cell
32:1301-1311). E. coli DH1 and yeast strain BSY90 transformed with
plasmid pBS39 carrying a cre gene isolated from bacteriophage P1 and a
GAL1 regulatory nucleotide sequence are available from the American Type
Culture Collection (ATCC) under accession numbers ATCC 53255 and ATCC
20772, respectively. The cre gene can be isolated from plasmid pBS39 with
restriction enzymes XhoI and SalI.
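
The layout of the LoxP site given above (SEQ ID NO. 51) can be checked with a short script. The following is only an illustrative sketch: the helper names (reverse_complement, split_lox_site, has_inverted_repeats, spacer_is_asymmetric) are hypothetical and are not part of the specification. It confirms that the two 13 base pair arms are inverted repeats of one another and that the 8 base pair spacer is not its own reverse complement, which is what gives the site a direction.

```python
# Illustrative sketch only; helper names are hypothetical.
# LoxP as written in the text: 13 bp repeat + 8 bp spacer + 13 bp repeat.
LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def split_lox_site(site):
    """Split a 34 bp lox site into (left repeat, spacer, right repeat)."""
    if len(site) != 34:
        raise ValueError("expected a 34 base pair site")
    return site[:13], site[13:21], site[21:]


def has_inverted_repeats(site):
    """True if the right arm is the reverse complement of the left arm."""
    left, _spacer, right = split_lox_site(site)
    return reverse_complement(left) == right


def spacer_is_asymmetric(site):
    """True if the spacer differs from its own reverse complement."""
    _left, spacer, _right = split_lox_site(site)
    return reverse_complement(spacer) != spacer


if __name__ == "__main__":
    print(split_lox_site(LOXP))
    print(has_inverted_repeats(LOXP), spacer_is_asymmetric(LOXP))
```

Because the repeats are symmetric, only the spacer distinguishes one orientation of the site from the other in this sketch.
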

As used herein, site-specific recombination refers to site-specific 
recombination that is effected between two specific sites on a single nucleic 
acid molecule or between two different molecules that requires the presence 
of an exogenous protein, such as an integrase or recombinase. 

For example, Cre-lox site-specific recombination can include the
following three events:

a. deletion of a pre-selected DNA segment flanked by lox sites;

b. inversion of the nucleotide sequence of a pre-selected DNA segment
flanked by lox sites; and

c. reciprocal exchange of DNA segments proximate to lox sites located
on different DNA molecules.

This reciprocal exchange of DNA segments can result in an integration
event if one or both of the DNA molecules are circular. DNA segment refers
to a linear fragment of single- or double-stranded deoxyribonucleic acid
(DNA), which can be derived from any source. Since the lox site is an
asymmetrical nucleotide sequence, two lox sites on the same DNA molecule
can have the same or opposite orientations with respect to each other.
Recombination between lox sites in the same orientation results in a deletion
of the DNA segment located between the two lox sites and a connection
between the resulting ends of the original DNA molecule. The deleted DNA
segment forms a circular molecule of DNA. The original DNA molecule and
the resulting circular molecule each contain a single lox site. Recombination
between lox sites in opposite orientations on the same DNA molecule results
in an inversion of the nucleotide sequence of the DNA segment located
between the two lox sites. In addition, reciprocal exchange of DNA
segments proximate to lox sites located on two different DNA molecules can
occur. All of these recombination events are catalyzed by the gene product
of the cre gene. Thus, the Cre-lox system can be used to specifically delete,
invert, or insert DNA. The precise event is controlled by the orientation of
lox DNA sequences: in cis, the lox sequences direct the Cre recombinase to
either delete (lox sequences in direct orientation) or invert (lox sequences in
inverted orientation) DNA flanked by the sequences, while in trans the lox
sequences can direct a homologous recombination event resulting in the
insertion of a recombinant DNA.
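
The dependence of the in cis outcome on lox orientation can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not a description of any method in the specification: a linear DNA molecule is represented as a list of labels, the hypothetical tokens ">" and "<" stand for a lox site in one orientation or the other, and the function name recombine_in_cis is likewise an assumption made for illustration.

```python
# Toy model only; tokens and function name are hypothetical.
LOX_FWD = ">"  # a lox site in one orientation
LOX_REV = "<"  # the same site in the opposite orientation


def recombine_in_cis(molecule):
    """Model Cre-mediated recombination between two lox sites on one molecule.

    Returns (product, circle). For sites in the same orientation the flanked
    segment is deleted and returned as a circle carrying one lox site; for
    sites in opposite orientations the flanked segment is inverted in place
    and circle is None.
    """
    sites = [i for i, item in enumerate(molecule) if item in (LOX_FWD, LOX_REV)]
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    first, second = sites
    between = molecule[first + 1:second]
    if molecule[first] == molecule[second]:
        # Direct orientation: deletion. One lox site stays on the original
        # molecule and one ends up on the excised circle.
        product = molecule[:first] + [molecule[first]] + molecule[second + 1:]
        circle = between + [molecule[second]]
        return product, circle
    # Opposite orientation: the segment between the sites is inverted.
    inverted = list(reversed(between))
    return molecule[:first + 1] + inverted + molecule[second:], None


if __name__ == "__main__":
    print(recombine_in_cis(["a", ">", "b", "c", ">", "d"]))
    print(recombine_in_cis(["a", ">", "b", "c", "<", "d"]))
```

In the deletion case both products retain a single lox site, matching the description above; the in trans exchange between two molecules is not modelled here.
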

As used herein, a plant refers to an organism that is taxonomically
classified as being in the kingdom Plantae. Such organisms include
eukaryotic organisms that contain chloroplasts capable of carrying out
photosynthesis. A plant can be unicellular or multicellular and can contain
multiple tissues and/or organs. Plants can reproduce sexually and/or
asexually and include species that are perennial or annual in growth habit.
Plants can be found to exist in a variety of habitats, including terrestrial and
aquatic environments. The term "plant" includes a whole plant, plant cell,
plant protoplast, plant calli, plant seed, plant organ, plant tissue, and other
parts of a whole plant.

As used herein, reproductive mode with reference to a plant refers to
any and all methods by which a plant produces progeny. Reproductive
modes include, but are not limited to, sexual and asexual reproduction.
Plants may produce progeny by one or multiple reproductive modes. Sexual
reproduction can include union of cells derived from haploid gametophytes
(e.g., eggs produced from ovules and sperm produced from pollen in seed
plants) to form diploid zygotes. Zygotes may be formed from gametophytes
from different plants or from gametophytes of the same plant (e.g., through
self-fertilization). Asexual reproduction can occur when offspring are
produced through modifications of the sexual life cycle that do not include
meiosis and syngamy. For example, when vascular plants reproduce
asexually, they may do so by vegetative reproduction, such as budding,
branching, and tillering, or by producing spores or seed genetically identical
to the sporophytes that produced them.

As used herein, stable maintenance of chromosomes occurs when at
least about 85%, preferably 90%, more preferably 95%, of the cells retain
the chromosome. Stability is measured in the presence of a selective agent.
Preferably these chromosomes are also maintained in the absence of a
selective agent. Stable chromosomes also retain their structure during cell
culturing, suffering no unintended intrachromosomal or interchromosomal
rearrangements.
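
The retention thresholds in this definition can be expressed as a simple calculation. The sketch below is illustrative only: the function names and the tier labels are hypothetical, the approximate figures ("about 85%") are treated as hard cut-offs purely for the sake of the example, and the structural-integrity requirement is not modelled.

```python
# Illustrative sketch; names and tier labels are hypothetical.
def retention_fraction(cells_with_chromosome, cells_scored):
    """Fraction of scored cells that retain the artificial chromosome."""
    if cells_scored <= 0:
        raise ValueError("cells_scored must be positive")
    if not 0 <= cells_with_chromosome <= cells_scored:
        raise ValueError("cells_with_chromosome must lie in [0, cells_scored]")
    return cells_with_chromosome / cells_scored


def stability_tier(fraction):
    """Map a retention fraction onto the 85% / 90% / 95% levels in the text."""
    if fraction >= 0.95:
        return "more preferred"
    if fraction >= 0.90:
        return "preferred"
    if fraction >= 0.85:
        return "stable"
    return "not stable"


if __name__ == "__main__":
    # Hypothetical counts, e.g. from scoring metaphase spreads.
    print(stability_tier(retention_fraction(92, 100)))
```
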

As used herein, BrdU refers to 5-bromodeoxyuridine, which during 
replication is inserted in place of thymidine. BrdU is used as a mutagen; it 
also inhibits condensation of metaphase chromosomes during cell division. 

As used herein, ribosomal RNA (rRNA) is the specialized RNA that
forms part of the structure of a ribosome and participates in the synthesis of
proteins. Ribosomal RNA is produced by transcription of genes which, in
eukaryotic cells, are present in multiple copies. In human cells, the
approximately 250 copies of rRNA genes (i.e., genes which encode rRNA)
per haploid genome are spread out in clusters on at least five different
chromosomes (chromosomes 13, 14, 15, 21 and 22). In mouse cells, the
presence of ribosomal DNA (rDNA, which is DNA containing sequences that
encode rRNA) has been verified on at least 11 pairs out of 20 mouse
chromosomes (chromosomes 5, 6, 7, 9, 11, 12, 15, 16, 17, 18, and 19)
[see, e.g., Rowe et al. (1996) Mamm. Genome 7:886-889 and Johnson et al.
(1993) Mamm. Genome 4:49-52]. In Arabidopsis thaliana the presence of
rDNA has been verified on chromosomes 2 and 4 (18S, 5.8S, and 25S
rDNA) and on chromosomes 3, 4, and 5 (5S rDNA) [see The Arabidopsis
Genome Initiative (2000) Nature 408:796-815]. In eukaryotic cells, the
multiple copies of the highly conserved rRNA genes are located in a tandemly
arranged series of rDNA units, which are generally about 40-45 kb in length
and contain a transcribed region and a nontranscribed region known as
spacer (i.e., intergenic spacer) DNA which can vary in length and sequence.
In the human and mouse, these tandem arrays of rDNA units are located
adjacent to the pericentric satellite DNA sequences (heterochromatin). The
regions of these chromosomes in which the rDNA is located are referred to
as nucleolar organizing regions (NOR) which loop into the nucleolus, the site
of ribosome production within the cell nucleus. In higher plants, the rDNA is
arranged in long tandem repeating units, similar to those of other higher
eukaryotes. The 18S, 5.8S and 25S rRNA genes are clustered and are
transcribed as one unit, while the 5S genes are located elsewhere in the
genome. Between the 3' end of the 25S gene and the 5' end of the 18S
gene is located a DNA spacer that ranges from 1 kb to greater than 12 kb in
length for different species. Therefore, the rDNA repeat ranges from about 4
kb to about 15 kb for different plant species [see, e.g., Rogers and Bendich
(1987) Plant Mol. Biol. 9:509-520].

As used herein, a megachromosome refers to a chromosome that,
except for introduced heterologous DNA, is substantially composed of
heterochromatin. Megachromosomes are made up of an array of repeated
amplicons that contain two inverted megareplicons bordered by introduced
heterologous DNA [see, e.g., Figure 3 of U.S. Patent No. 6,077,697 for a
schematic drawing of a megachromosome]. For purposes herein, a
megachromosome is about 50 to 400 Mb, generally about 250-400 Mb.
Shorter variants are also referred to as truncated megachromosomes [about
90 to 120 or 150 Mb], dwarf megachromosomes [~150-200 Mb] and cell
lines, and a micro-megachromosome [~50-90 Mb, typically 50-60 Mb]. For
purposes herein, the term megachromosome refers to the overall repeated
structure based on an array of repeated chromosomal segments (amplicons)
that contain two inverted megareplicons bordered by any inserted
heterologous DNA.
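
The approximate size ranges in this definition can be collected into a small lookup. This is a hedged sketch only: the names SIZE_CLASSES, size_classes and in_overall_range are hypothetical, the "about" values are treated as hard bounds for illustration, and a size on a shared boundary is reported under both neighbouring labels rather than forced into one.

```python
# Illustrative sketch; names are hypothetical and bounds (in Mb) are the
# approximate values from the text treated as hard limits.
SIZE_CLASSES = [
    ("micro-megachromosome", 50, 90),
    ("truncated megachromosome", 90, 150),
    ("dwarf megachromosome", 150, 200),
    ("megachromosome (general range)", 250, 400),
]


def in_overall_range(size_mb):
    """True if the size lies within the overall 50 to 400 Mb range."""
    return 50 <= size_mb <= 400


def size_classes(size_mb):
    """Return every named size class whose range includes size_mb."""
    return [name for name, low, high in SIZE_CLASSES if low <= size_mb <= high]


if __name__ == "__main__":
    for size in (60, 150, 300):
        print(size, size_classes(size))
```
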

As used herein, transformation and transfection are used
interchangeably to refer to the process of introducing nucleic acid
into cells. The terms transfection and transformation refer to the
taking up of exogenous nucleic acid, e.g., an expression vector, by a host
cell whether or not any coding sequences are in fact expressed. Numerous
methods of introducing nucleic acids into cells are known to the ordinarily
skilled artisan, for example, by Agrobacterium-mediated transformation,
protoplast transfection (including polyethylene glycol (PEG)-mediated
transfection, electroporation, protoplast fusion, and microcell fusion),
lipid-mediated delivery, liposomes, electroporation, microinjection, particle
bombardment and silicon carbide whisker-mediated transformation (see, e.g.,
Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985) Mol.
Gen. Genet. 199:169-177; Reich et al. (1986) Biotechnology 4:1001-1004;
Klein et al. (1987) Nature 327:70-73; U.S. Patent No. 6,143,949;
Paszkowski et al. (1989) in Cell Culture and Somatic Cell Genetics of Plants,
Vol. 6, Molecular Biology of Plant Nuclear Genes, eds. Schell, J and Vasil,
L.K. Academic Publishers, San Diego, California, p. 52-68; and Frame et al.
(1994) Plant J. 6:941-948), direct uptake using calcium phosphate [CaPO4;
see, e.g., Wigler et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1373-1376],
polyethylene glycol [PEG]-mediated DNA uptake, lipofection [see, e.g.,
Strauss (1996) Meth. Mol. Biol. 54:307-327], microcell fusion [see Lambert
(1991) Proc. Natl. Acad. Sci. U.S.A. 88:5907-5911; U.S. Patent No.
5,396,767, Sawford et al. (1987) Somatic Cell Mol. Genet. 13:279-284;
Dhar et al. (1984) Somatic Cell Mol. Genet. 10:547-559; and McNeill-Killary
et al. (1995) Meth. Enzymol. 254:133-152], lipid-mediated carrier systems
[see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al. (1996)
Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim.
31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654; Le Bolch et
al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993) Meth.
Enzymol. 217:599-618] or other suitable method. Successful transfection is
generally recognized by detection of the presence of the heterologous nucleic
acid within the transfected cell, such as, for example, any visualization of the
heterologous nucleic acid or any indication of the operation of a vector within
the host cell.

As used herein, injected refers to the microinjection (use of a small
syringe, needle, or pipette) of nucleic acid into a cell.

As used herein, gene therapy involves the transfer or insertion of
nucleic acid molecules into certain cells, which are also referred to as target
cells, to produce products that are involved in preventing, curing, correcting,
controlling or modulating diseases, disorders and/or deleterious conditions.
The nucleic acid is introduced into the selected target cells in a manner such
that the nucleic acid is expressed and a product encoded thereby is
produced. Alternatively, the nucleic acid may in some manner mediate
expression of DNA that encodes a therapeutic product. This product may be
a therapeutic compound, which is produced in therapeutically effective
amounts or at a therapeutically useful time. The nucleic acid may also encode
a product, such as a peptide or RNA, that in some manner mediates, directly
or indirectly, expression of a therapeutic product. Expression of the nucleic
acid by the target cells within an organism afflicted with a disease or
disorder thereby enables modulation of the disease or disorder. The nucleic
acid encoding the therapeutic product may be modified prior to introduction
into the cells of the afflicted host in order to enhance or otherwise alter the
product or expression thereof.

For use in gene therapy, cells can be transfected in vitro, followed by
introduction of the transfected cells into an organism. This is often referred
to as ex vivo gene therapy. Alternatively, the cells can be transfected
directly in vivo within an organism.

As used herein, a therapeutically effective product is a product that
effectively ameliorates or eliminates the symptoms or manifestations of an
inherited or acquired disease or disorder or that cures said disease or disorder
in an organism. For example, therapeutically effective products include a
product that is encoded by heterologous DNA expressed in a diseased
organism and a product produced from heterologous DNA in a host cell and
to which a diseased organism is exposed.

As used herein, a transgenic plant refers to a plant (e.g., a plant cell,
tissue, organ or whole plant) containing heterologous or foreign nucleic acid
or in which the expression of a gene naturally present in the plant has been
altered. Heterologous nucleic acid within a transgenic plant may be
transiently or stably maintained within the plant. Stable maintenance of
heterologous nucleic acid may be maintenance of the nucleic acid through
one or more, or two or more, or five or more, or ten or more, or 25 or more,
or 50 or more or 60 or more cell divisions. A transgenic plant may contain
heterologous nucleic acid in one cell, multiple cells or all cells. A transgenic
plant may produce progeny that contain or do not contain the heterologous
nucleic acid.

As used herein, a promoter, with respect to a region of DNA, refers to
a sequence of DNA that contains a sequence of bases that signals RNA
polymerase to associate with the DNA and initiate transcription of messenger
RNA (mRNA) from a template strand of the DNA. A promoter thus generally
regulates transcription of DNA into mRNA.

As used herein, operative linkage of heterologous DNA to regulatory
and effector sequences of nucleotides, such as promoters, enhancers,
transcriptional and translational stop sites, and other signal sequences refers
to the relationship between such DNA and such sequences of nucleotides.
For example, operative linkage of heterologous DNA to a promoter refers to
the physical relationship between the DNA and the promoter such that the
transcription of such DNA is initiated from the promoter by an RNA
polymerase that specifically recognizes, binds to and transcribes the DNA in
reading frame.

As used herein, isolated, substantially pure nucleic acid, such as, for
example, DNA, refers to nucleic acid fragments purified according to
standard techniques employed by those skilled in the art, such as that found
in Maniatis et al. [(1982) Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, NY].

As used herein, expression refers to the transcription and/or
translation of nucleic acid. For example, expression can be the transcription
of a gene into an RNA molecule, such as a messenger RNA (mRNA)
molecule. Expression may further include translation of an RNA molecule
into peptides, polypeptides, or proteins. If the nucleic acid is derived from
genomic DNA, expression may, if an appropriate eukaryotic host cell or
organism is selected, include splicing of the mRNA. With respect to an
antisense construct, expression may refer to the transcription of the
antisense DNA.

As used herein, vector or plasmid refers to discrete elements that are
used to introduce heterologous nucleic acids into cells for either expression
of the heterologous nucleic acid or for replication of the heterologous nucleic
acid. Selection and use of such vectors and plasmids are well within the
level of skill of the art.

As used herein, substantially homologous DNA refers to DNA that
includes a sequence of nucleotides that is sufficiently similar to another such
sequence to form stable hybrids under specified conditions.

It is well known to those of skill in this art that nucleic acid fragments
with different sequences may, under the same conditions, hybridize
detectably to the same "target" nucleic acid. Two nucleic acid fragments
hybridize detectably, under stringent conditions over a sufficiently long
hybridization period, because one fragment contains a segment of at least
about 14 nucleotides in a sequence which is complementary (or nearly
complementary) to the sequence of at least one segment in the other nucleic
acid fragment. If the time during which hybridization is allowed to occur is
held constant, at a value during which, under preselected stringency
conditions, two nucleic acid fragments with exactly complementary
base-pairing segments hybridize detectably to each other, departures from
exact complementarity can be introduced into the base-pairing segments, and
base-pairing will nonetheless occur to an extent sufficient to make
hybridization detectable. As the departure from complementarity between the
base-pairing segments of two nucleic acids becomes larger, and as conditions
of the hybridization become more stringent, the probability decreases that the
two segments will hybridize detectably to each other.

Two single-stranded nucleic acid segments have "substantially the 
same sequence," within the meaning of the present specification, if (a) both 
form a base-paired duplex with the same segment, and (b) the melting 
temperatures of said two duplexes in a solution of 0.5 x SSPE differ by less 
than 10°C. If the segments being compared have the same number of 
bases, then to have "substantially the same sequence", they will typically 
differ in their sequences at fewer than 1 base in 10. Methods for determining 
melting temperatures of nucleic acid duplexes are well known [see, e.g., 
Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284 and references 
cited therein]. 
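
The two-part criterion above can be written as a small check. The following 
Python sketch is illustrative only: the function names are hypothetical, the 
melting temperatures are assumed to have been determined separately for each 
duplex in 0.5 x SSPE, and the mismatch count assumes an ungapped, 
position-by-position comparison of segments having the same number of bases. 

```python
def tm_criterion_met(tm_a_celsius: float, tm_b_celsius: float,
                     max_difference: float = 10.0) -> bool:
    """Criterion (b): the two duplex melting temperatures differ by < 10 deg C."""
    return abs(tm_a_celsius - tm_b_celsius) < max_difference


def mismatch_fraction(segment_a: str, segment_b: str) -> float:
    """Fraction of positions at which two equal-length segments differ."""
    if not segment_a or len(segment_a) != len(segment_b):
        raise ValueError("segments must be non-empty and of equal length")
    a, b = segment_a.upper(), segment_b.upper()
    return sum(x != y for x, y in zip(a, b)) / len(a)


def typically_same_sequence(segment_a: str, segment_b: str) -> bool:
    """Rule of thumb from the text: fewer than 1 differing base in 10."""
    return mismatch_fraction(segment_a, segment_b) < 0.1
```

The mismatch fraction is only the rule of thumb stated for equal-length 
segments; the melting-temperature comparison remains the operative test. 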

As used herein, a nucleic acid probe is a DNA or RNA fragment that 
includes a sufficient number of nucleotides to specifically hybridize to DNA or 
RNA that includes identical or closely related sequences of nucleotides. A 
probe may contain any number of nucleotides, from as few as about 10 to 
as many as hundreds of thousands of nucleotides. The conditions and 
protocols for such hybridization reactions are well known to those of skill in 
the art, as are the effects of probe size, temperature, degree of mismatch, 
salt concentration and other parameters on the hybridization reaction. For 
example, the lower the temperature and higher the salt concentration at 
which the hybridization reaction is carried out, the greater the degree of 
mismatch that may be present in the hybrid molecules. 

To be used as a hybridization probe, the nucleic acid is generally 
rendered detectable by labelling it with a detectable moiety or label, such as 
³²P, ³H and ¹⁴C, or by other means, including chemical labelling, such as by 
nick-translation in the presence of deoxyuridylate biotinylated at the 5'- 
position of the uracil moiety. The resulting probe includes the biotinylated 
uridylate in place of thymidylate residues and can be detected (via the biotin 
moieties) by any of a number of commercially available detection systems 
based on binding of streptavidin to the biotin. Such commercially available 
detection systems can be obtained, for example, from Enzo Biochemicals, 
Inc. (New York, NY). Any other label known to those of skill in the art, 
including non-radioactive labels, may be used as long as it renders the probes 
sufficiently detectable, which is a function of the sensitivity of the assay, the 
time available (for culturing cells, extracting DNA, and hybridization assays), 
the quantity of DNA or RNA available as a source of the probe, the particular 
label and the means used to detect the label. 

Once sequences with a sufficiently high degree of homology to the 
probe are identified, they can readily be isolated by standard techniques, 
which are described, for example, by Maniatis et al. [(1982) Molecular 
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, NY]. 

As used herein, conditions under which DNA molecules form stable 
hybrids and are considered substantially homologous are such that DNA 
molecules with at least about 60% complementarity form stable hybrids. 
Such DNA fragments are herein considered to be "substantially 
homologous". For example, DNA that encodes a particular protein is 
substantially homologous to another DNA fragment if the DNA forms stable 
hybrids such that the sequences of the fragments are at least about 60% 
complementary and if a protein encoded by the DNA retains its activity. 
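
As a rough illustration of the 60% figure, the following Python sketch counts 
Watson-Crick pairs between two strands aligned antiparallel. It is a 
simplification under stated assumptions: the function names are hypothetical, 
the strands are of equal length and aligned without gaps, and the second 
requirement of the definition (retention of activity by the encoded protein) 
is not modeled. 

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percent of positions forming A-T or G-C pairs in antiparallel alignment.

    Both strands are written 5'->3'; strand_b is reversed so that its 3' end
    is compared with the 5' end of strand_a.
    """
    a, b = strand_a.upper(), strand_b.upper()
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(1 for x, y in zip(a, reversed(b)) if _COMPLEMENT.get(x) == y)
    return 100.0 * paired / len(a)


def meets_complementarity_threshold(strand_a: str, strand_b: str,
                                    threshold_percent: float = 60.0) -> bool:
    """True if the strands are at least `threshold_percent` complementary."""
    return percent_complementarity(strand_a, strand_b) >= threshold_percent
```

Whether a stable hybrid actually forms depends on the hybridization 
conditions; the count above only captures the sequence relationship. 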

For purposes herein, the following stringency conditions are defined: 
1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C 
2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C 
3) low stringency: 1.0 x SSPE, 0.1% SDS, 50°C 
or any combination of salt and temperature and other reagents that result in 
selection of the same degree of mismatch or matching. 
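
The three defined conditions can be held in a small lookup table. The Python 
sketch below is illustrative only; the class and function names are 
hypothetical. The comparison rule follows the relationship stated above, that 
lower temperature and higher salt concentration tolerate more mismatch, and 
does not attempt to model equivalent combinations of salt, temperature and 
other reagents. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    sspe_x: float       # SSPE concentration, as a multiple of 1 x SSPE
    sds_percent: float  # SDS concentration in percent
    temp_c: float       # temperature in degrees Celsius


STRINGENCY = {
    "high": Stringency(sspe_x=0.1, sds_percent=0.1, temp_c=65.0),
    "medium": Stringency(sspe_x=0.2, sds_percent=0.1, temp_c=50.0),
    "low": Stringency(sspe_x=1.0, sds_percent=0.1, temp_c=50.0),
}


def at_least_as_stringent(a: Stringency, b: Stringency) -> bool:
    """True if `a` has no more salt and no lower temperature than `b`."""
    return a.sspe_x <= b.sspe_x and a.temp_c >= b.temp_c
```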

As used herein, all assays and procedures, such as hybridization 
reactions and antibody-antigen reactions, unless otherwise specified, are 
conducted under conditions recognized by those of skill in the art as 
standard conditions. 

A. Amplification of Chromosomal Segments and Use Thereof in the 
Generation of Artificial Chromosomes 

The methods, cells and artificial chromosomes provided herein are 
produced by virtue of the discovery of the existence of a higher-order 
replication unit (megareplicon) of the centromeric region, including the 
pericentric DNA, of a chromosome. This megareplicon is delimited by a 
primary replication initiation site (megareplicator), and appears to facilitate 
replication of the centromeric heterochromatin, and, most likely, 
centromeres. Integration of heterologous nucleic acid into the megareplicator 
region, or in close proximity thereto, initiates a large-scale amplification of 
megabase-size chromosomal segments. Products of such amplification may 
be used as artificial chromosomes or in the generation of artificial 
chromosomes as described herein. 

Included among the DNA sequences that may provide a 
megareplicator are the rDNA units that give rise to ribosomal RNA (rRNA). In 
plants and animals, particularly mammals such as mice and humans, these 
rDNA units can contain specialized elements, such as the origin of replication 
(or origin of bidirectional replication, i.e., OBR, in mouse), 
amplification-promoting sequences (APS) and amplification control elements 
(ACE) [see, e.g., with respect to plant rDNA, U.S. Patent Nos. 6,096,546 (to 
Raskin) and 6,100,092 (to Borysyuk et al.); PCT International Application 
Publication No. WO99/66058; GenBank Accession no. Y08422 (containing the 
central AT-rich region of a tobacco rDNA intergenic spacer); Borysyuk et al. 
(1997) Plant Mol. Biol. 35:655-660; Borysyuk et al. (2000) Nature 
Biotechnology 18:1303-1306; Hernandez et al. (1993) EMBO J. 12:1475-1485; 
Van't Hof and Lamm (1992) Plant Mol. Biol. 20:377-382; Hernandez et al. 
(1988) Plant Mol. Biol. 10:413-422; and with respect to mammalian rDNA, 
Gogel et al. (1996) Chromosoma 104:511-518; Coffman et al. (1993) Exp. Cell. 
Res. 209:123-132; Little et al. (1993) Mol. Cell. Biol. 13:6600-6613; Yoon et 
al. (1995) Mol. Cell. Biol. 15:2482-2489; Gonzalez and Sylvester (1995) 
Genomics 27:320-328; Miesfeld and Arnheim (1982) Nuc. Acids Res. 
10:3933-3949; Maden et al. (1987) Biochem. J. 246:519-527]. 

As described herein, without being bound by any theory, specialized 
elements such as these may facilitate replication and/or amplification of 
megabase-size chromosomal segments in the de novo formation of 
chromosomes, such as the artificial chromosomes described herein, in cells. 
These specialized elements are typically located in the nontranscribed 
intergenic spacer region upstream of the transcribed region of rDNA. The 
intergenic spacer region may itself contain internally repeated sequences 
which can be classified as tandemly repeated blocks and nontandem blocks 
(see, e.g., Gonzalez and Sylvester (1995) Genomics 27:320-328). In mouse 
rDNA, an origin of bidirectional replication may be found within a 3-kb 
initiation zone centered approximately 1.6 kb upstream of the transcription 
start site (see, e.g., Gogel et al. (1996) Chromosoma 104:511-518). The 
sequences of these specialized elements tend to have an altered chromatin 
structure, which may be detected, for example, by nuclease hypersensitivity 
or the presence of AT-rich regions that can give rise to bent DNA structures. 

Sequences of intergenic spacer regions of plant rDNA include, but are 
not limited to, sequences contained in GenBank Accession numbers S70723 
(from the 5S rDNA of barley (Hordeum vulgare)), AF013103 and X03989 
(from maize (Zea mays)), X65489 (from potato (Solanum tuberosum)), 
X52265 (from tomato (Lycopersicon esculentum)), AF177418 (from 
Arabidopsis neglecta), AF177421 and AF17422 (from Arabidopsis halleri), 
A71562, X15550, and X52631 (from Arabidopsis thaliana; see Gruendler et 
al. (1991) J. Mol. Biol. 221:1209-1222 and Gruendler et al. (1989) Nucleic 
Acids Res. 17:6395-6396), X54194 (from rice (Oryza sativa)) and Y08422 
and D76443 (from tobacco (Nicotiana tabacum)). Sequences of intergenic 
spacer regions of plant rDNA further include sequences from rye (see Appels 
et al. (1986) Can. J. Genet. Cytol. 28:673-685), wheat (see Barker et al. 
(1988) J. Mol. Biol. 201:1-17 and Sardana and Flavell (1996) Genome 
39:288-292), radish (see Delcasso-Tremousaygue et al. (1988) Eur. J. 
Biochem. 172:767-776), Vicia faba and Pisum sativum (see Kato et al. 
(1990) Plant Mol. Biol. 14:983-993), mung bean (see Gerstner et al. (1988) 
Genome 30:723-733; and Schiebel et al. (1989) Mol. Gen. Genet. 218:302- 
307), tomato (see Schmidt-Puchta et al. (1989) Plant Mol. Biol. 13:251- 
253), Hordeum bulbosum (see Procunier et al. (1990) Plant Mol. Biol. 
15:661-663) and Lens culinaris Medik., and other legume species (see 
Fernandez et al. (2000) Genome 43:597-603). Nucleic acids containing 
intergenic spacer sequences from plants can be obtained by nucleic acid 
amplification of DNA from plant cells using oligonucleotide primers 
corresponding to the 3' end of the conserved 25S mature rRNA encoding 
region and the 5' end of the conserved 18S mature rRNA encoding region 
(see PCT Application Publication No. WO98/13505). 

An exemplary sequence encompassing a mammalian origin of 
replication is provided in GENBANK accession no. X82564 at about positions 
2430-5435. Exemplary sequences encompassing mammalian 
amplification-promoting sequences include nucleotides 690-1060 and 1105-1530 
of GENBANK accession no. X82564 and are also provided in PCT Application 
Publication No. WO 97/40183. Exemplary sequences encompassing plant 
amplification-promoting sequences (APS) include those provided in U.S. 
Patent No. 6,100,092. 

In human rDNA, a primary replication initiation site may be found a 
few kilobase pairs upstream of the transcribed region and secondary initiation 
sites may be found throughout the nontranscribed intergenic spacer region 
(see, e.g., Yoon et al. (1995) Mol. Cell. Biol. 15:2482-2489). A complete 
human rDNA repeat unit is presented in GENBANK as accession no. U13369. 
Another exemplary sequence encompassing a replication initiation site may 
be found within the sequence of nucleotides 35355-42486 in GENBANK 
accession no. U13369, particularly within the sequence of nucleotides 
37912-42486 and more particularly within the sequence of nucleotides 
37912-39288 of GENBANK accession no. U13369 (see Coffman et al. 
(1993) Exp. Cell. Res. 209:123-132). 
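
The nucleotide ranges quoted above are 1-based and inclusive, as is usual for 
database records. The following Python sketch shows how such a range could be 
cut from a sequence that has already been retrieved as a plain string. It is 
illustrative only: the table and function names are hypothetical, the region 
labels are paraphrased from the text, and retrieval of the record itself is 
not shown. 

```python
from typing import Dict, List, Tuple

# 1-based, inclusive nucleotide ranges as quoted in the text
# (the X82564 origin range is given there as approximate).
REGIONS: Dict[str, List[Tuple[str, int, int]]] = {
    "X82564": [
        ("origin of replication", 2430, 5435),
        ("amplification-promoting sequence", 690, 1060),
        ("amplification-promoting sequence", 1105, 1530),
    ],
    "U13369": [
        ("replication initiation site", 35355, 42486),
        ("replication initiation site, narrower", 37912, 42486),
        ("replication initiation site, narrowest", 37912, 39288),
    ],
}


def region_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based inclusive range."""
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end (1-based, inclusive) of `sequence`."""
    if start < 1 or end < start or end > len(sequence):
        raise ValueError("range lies outside the sequence")
    return sequence[start - 1:end]
```

For example, `extract_region("ACGTACGTAC", 3, 6)` returns the four nucleotides 
at positions 3 through 6 of the toy string. 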

B. Preparation of Plant Artificial Chromosomes 

Cell lines containing artificial chromosomes can be prepared by 
transforming cells, preferably a stable cell line, with heterologous nucleic acid 
and identifying cells that contain an artificial chromosome as described 
herein. The artificial chromosome is a chromosomal structure that is distinct 
from any chromosome that existed in the cell prior to introduction of the 
heterologous nucleic acid. A cell containing an artificial chromosome may be 
identified using a variety of procedures, alone or in combination, as described 
in detail herein. In particular embodiments of the methods described herein, 
the heterologous nucleic acid contains a sequence that targets the nucleic 
acid to an amplifiable region of a chromosome in the cell, such as, for 
example, the pericentric heterochromatin and/or rDNA. A variety of targeting 
sequences are provided herein. 

Prior to analyzing transformed cells for the presence of an artificial 
chromosome, the cells to be analyzed may be enriched with artificial 
chromosome-containing cells using a variety of techniques depending on the 
heterologous nucleic acid that was introduced into the host cell to initiate 
generation of the artificial chromosomes. For example, if nucleic acid 
encoding a selectable marker was included in the heterologous nucleic acid, 
cells containing the marker may be selected for analysis. If the selectable 
marker is one that confers resistance to a cytotoxic agent, e.g., bialaphos, 
hygromycin or kanamycin, the transformed cells may be cultured under 
selective conditions which include the agent. Cells surviving growth under 
selective conditions are then analyzed for the presence of artificial 
chromosomes. If the selectable marker is a readily detectable reporter 
molecule, such as, for example, a fluorescent protein, the transformed cells 
may be selected on the basis of fluorescent properties. For example, cells 
containing the fluorescent protein may be isolated from nontransformed cells 
using a fluorescence-activated cell sorter (FACS). 

In analyzing transformed cells for the presence of artificial 
chromosomes, it is also possible to identify cells that have a multicentric, 
typically dicentric, chromosome, a formerly multicentric (typically dicentric) 
chromosome, a minichromosome and/or heterochromatic structures, such as a 
megachromosome and a sausage chromosome. If cells containing 
multicentric chromosomes or formerly multicentric (typically formerly 
dicentric) chromosomes are initially selected, these cells can then be 
manipulated, if need be, as described herein to produce the 
minichromosomes and other artificial chromosomes, particularly the 
heterochromatic artificial chromosomes and other segmented, repeat 
region-containing artificial chromosomes, as described herein. 

1. Cells used in the generation of plant artificial chromosomes 

Any cells harboring plant centromere-containing chromosomes may be 
used in the generation of plant artificial chromosomes (PACs). Such cells 
include, but are not limited to, plant cells, protoplasts, and cells that are 
hybrid cells of one or more plant species. Preferred cells are those that 
harbor plant centromere-containing chromosomes and are readily susceptible 
to the introduction of heterologous nucleic acids therein. 

Cells for use in the generation of plant artificial chromosomes include 
cells that harbor acrocentric plant chromosomes. Examples of acrocentric 
plant chromosomes include chromosomes 2 and 4 of the plant Arabidopsis 
thaliana (see, e.g., Mayer et al. (1999) Nature 402:769-777; Murata et al. 
(1997) The Plant Journal 12:31-37; The Arabidopsis Genome Initiative 
(2000) Nature 408:796-815), four acrocentric chromosome pairs in 
Helianthus annuus (sunflower; see Schrader et al. (1997) Chromosome Res. 
5:451-456), two pairs of acrocentric chromosomes in domesticated pepper 
plant (Capsicum annuum) and a nearly acrocentric chromosome in lentil 
plant. In particular embodiments of the methods described herein, cells 
harboring acrocentric plant chromosomes containing rDNA are used in 
generating plant artificial chromosomes. 

Plant species from which cells may be obtained include, but are not 
limited to, vegetable crops, fruit and vine crops, field plants, bedding plants, 
trees, shrubs, and other nursery stock. Examples of vegetable crops include 
artichokes, kohlrabi, arugula, leeks, asparagus, lettuce, bok choy, malanga, 
broccoli, melons (e.g., muskmelon, watermelon, crenshaw, honeydew, 
cantaloupe), Brussels sprouts, cabbage, cardoni, carrots, napa, cauliflower, 
okra, onions, celery, parsley, chick peas, parsnips, chicory, Chinese cabbage, 
peppers, collards, potatoes, cucumber plants, pumpkins, cucurbits, radishes, 
dry bulb onions, rutabaga, eggplant, salsify, escarole, shallots, endive, garlic, 
spinach, green onions, squash, greens, beet, sweet potatoes, Swiss chard, 
horseradish, tomatoes, kale, turnips and spices. Fruit and vine crops include 
apples, apricots, cherries, nectarines, peaches, pears, plums, prunes, quince, 
almonds, chestnuts, filberts, pecans, pistachios, walnuts, citrus, blueberries, 
boysenberries, cranberries, currants, loganberries, raspberries, strawberries, 
blackberries, grapes, avocados, bananas, kiwi, persimmons, pomegranate, 
pineapple, tropical fruits, pomes, melon, mango, papaya and lychee. 




Field crop plants include evening primrose, meadow foam, corn, 
maize, hops, jojoba, peanuts, rice, safflower, small grains (barley, oats, rye, 
wheat, and others), sorghum, tobacco, kapok, leguminous plants (beans, 
lentils, peas, soybeans), oil plants (canola, rape, mustard, poppy, olives, 
sunflowers, coconut, castor oil plants, cocoa beans, groundnuts), fibre plants 
(cotton, flax, hemp, jute), lauraceae (cinnamon, camphor) and plants such as 
coffee, sugarcane, tea and natural rubber plants. Other examples of plants 
include bedding plants such as flowers, cactus, succulents and ornamental 
plants, as well as trees such as forest (broad-leaved trees and evergreens, 
such as conifers), fruit, ornamental and nut-bearing trees, shrubs, algae, 
moss, and duckweed. 

2. Heterologous nucleic acids for use in generating plant artificial 
chromosomes 

a. Selectable markers 

The heterologous nucleic acid that is introduced into a cell in the 
generation of artificial chromosomes as described herein may include nucleic 
acid encoding a selectable marker. Any nucleic acid that includes a 
selectable marker sequence may be introduced into cells harboring plant 
centromere-containing chromosomes for the generation of plant artificial 
chromosomes. Examples of selectable markers include, but are not limited 
to, DNA encoding a product that confers resistance to a cytotoxic or 
cytostatic agent and DNA encoding a readily detectable product, such as a 
reporter protein. 

(1) Nucleic acids encoding products that confer 
resistance to a selection agent 

Examples of selectable markers include the dihydrofolate reductase 
(dhfr) gene, hygromycin phosphotransferase genes, the phosphinothricin 
acetyl transferase gene (bar gene) and neomycin phosphotransferase genes. 
Selectable markers that can be used in animal, e.g., mammalian, cells include, 
but are not limited to, the thymidine kinase gene and the cellular adenine 
phosphoribosyltransferase gene. 

Of particular interest for purposes herein are nucleic acid selectable 
markers that, upon expression in the host cell, confer antibiotic or herbicide 
resistance to the cell, sufficient to provide for the maintenance of 
heterologous nucleic acids in the cell, and which facilitate the transfer of 
artificial chromosomes containing the marker DNA into new host cells. 
Examples of such markers include DNA encoding products that confer 
cellular resistance to hygromycin, kanamycin, G418, bialaphos, Basta, 
methotrexate, glyphosate, and puromycin. For example, neo (or nptII) 
provides kanamycin resistance and can be selected for using kanamycin, 
G418, paromomycin and other agents [see, e.g., Messing and Vierra (1982) 
Gene 19:259-268; and Bevan et al. (1983) Nature 304:184-187]; bar from 
Streptomyces hygroscopicus, which encodes the enzyme phosphinothricin 
acetyl transferase (PAT), confers bialaphos, glufosinate, Basta or 
phosphinothricin resistance [see, e.g., White et al. (1990) Nuc. Acids Res. 
18:1062; Spencer et al. (1990) Theor. Appl. Genet. 79:625-631; Vickers et 
al. (1996) Plant Mol. Biol. Reporter 14:363-368; and Thompson et al. (1987) 
EMBO J. 6:2519-2523]; the hph gene confers resistance to the 
antibiotic hygromycin (see, e.g., Blochinger and Diggelmann, Mol. Cell. Biol. 
4:2929-2931); a mutant EPSP synthase protein [see Hinchee et al. (1988) 
Bio/technol. 6:915-922] confers glyphosate resistance (see also U.S. Patent 
Nos. 4,940,935 and 5,188,642); and a nitrilase such as bxn from Klebsiella 
ozaenae confers resistance to bromoxynil [see Stalker et al. (1988) Science 
242:419-42]. DNA encoding cystathionine gamma-synthase (CGS) can be 
used as a marker that confers resistance to ethionine (see PCT Application 
Publication No. WO 00/55303). Examples of markers that can be used in 
animal, e.g., mammalian, cells include, but are not limited to, DNA encoding 
products that confer cellular resistance to streptomycin, zeocin, 
chloramphenicol and tetracycline. 
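
The marker and agent pairings named in the preceding paragraph can be 
collected into a lookup table, for example when planning which selective 
agent to add to a culture. The Python sketch below is illustrative only: the 
table and function names are hypothetical, and the table records only the 
pairings stated in the text, not a complete list of agents or markers. 

```python
from typing import Dict, List, Tuple

# Marker -> selection agents, as paired in the text above.
MARKER_AGENTS: Dict[str, Tuple[str, ...]] = {
    "neo (nptII)": ("kanamycin", "G418", "paromomycin"),
    "bar": ("bialaphos", "glufosinate", "Basta", "phosphinothricin"),
    "hph": ("hygromycin",),
    "mutant EPSP synthase": ("glyphosate",),
    "bxn": ("bromoxynil",),
    "CGS": ("ethionine",),
}


def markers_for_agent(agent: str) -> List[str]:
    """Return the markers listed as conferring resistance to `agent`."""
    wanted = agent.lower()
    return [marker for marker, agents in MARKER_AGENTS.items()
            if wanted in (a.lower() for a in agents)]
```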

(2) Reporter Molecules 

Nucleic acids encoding reporter molecules may also be included in the 
nucleic acid that is introduced into a recipient cell in the generation of 
artificial chromosomes. Reporter genes provide a means for identifying cells 
and chromosomes into which heterologous nucleic acids have been 
transferred and further provide a means for assessing whether or not, and to 
what extent, transferred DNA is expressed. 

Nucleic acids encoding reporter molecules that may be used in 
monitoring transfer and expression of heterologous nucleic acids into cells, 
particularly plant cells, include, but are not limited to, nucleic acid encoding 
β-glucuronidase (GUS) or the uidA gene product, which is an enzyme for which 
various chromogenic substrates are known [see Novel and Novel (1973) Mol. 
Gen. Genet. 120:319-335; Jefferson et al. (1986) Proc. Natl. Acad. Sci. 
USA 83:8447-8451; US Patent No. 5,268,463; commercially available from 
Clontech Laboratories, Palo Alto, CA], DNA from an R-locus gene, which 
encodes a product that regulates the production of anthocyanin pigments 
(red color) in plant tissues [see, e.g., Dellaporta et al. (1988) In 
"Chromosome Structure and Function: Impact of New Concepts, 18th 
Stadler Genetics Symposium" 11:263-282], nucleic acid encoding β-lactamase 
[Sutcliffe (1978) Proc. Natl. Acad. Sci. U.S.A. 75:3737-3741], which is an 
enzyme for which various chromogenic substrates are known (e.g., PADAC, 
a chromogenic cephalosporin), DNA from a xylE gene [see, e.g., Zukowsky 
et al. (1983) Proc. Natl. Acad. Sci. U.S.A. 80:1101-1105], which encodes a 
catechol dioxygenase that can convert chromogenic catechols, nucleic acid 
encoding α-amylase [see, e.g., Ikuta et al. (1990) Bio/technol. 8:241-242], 
nucleic acid encoding tyrosinase [see, e.g., Katz et al. (1983) J. Gen. 
Microbiol. 129:2703-2714], an enzyme capable of oxidizing tyrosine to 
DOPA and dopaquinone, which in turn condenses to form the readily 
detectable compound melanin, nucleic acid encoding β-galactosidase, an 
enzyme for which there are chromogenic substrates, nucleic acid encoding 
a luciferase (lux) gene [see, e.g., Ow et al. (1986) Science 234:856-859], 
which allows for bioluminescence detection, nucleic acid encoding aequorin 
[see, e.g., Prasher et al. (1985) Biochem. Biophys. Res. Commun. 126:1259- 
1268], which may be employed in calcium-sensitive bioluminescence 
detection, nucleic acid encoding a green fluorescent protein (GFP) [see, e.g., 
Sheen et al. (1995) Plant J. 8:777-784; Haseloff et al. (1997) Proc. Natl. 
Acad. Sci. U.S.A. 94:2122-2127; Haseloff and Amos (1995) Trends Genet. 
11:328-329; Reichel et al. (1996) Proc. Natl. Acad. Sci. U.S.A. 93:5888- 
5893; Tian et al. (1997) Plant Cell Rep. 16:267-271; Prasher et al. (1992) 
Gene 111:229-233; Chalfie et al. (1994) Science 263:802; PCT Application 
Publication Nos. WO97/41228 and WO 95/07463; and commercially 
available from Clontech Laboratories, Palo Alto, CA], nucleic acid encoding a 
red or blue fluorescent protein (RFP or BFP, respectively), or nucleic acid 
encoding chloramphenicol acetyltransferase (CAT). 

Enhanced GFP (EGFP) is a mutant of GFP with a 35-fold increase in 
fluorescence. This variant has mutations of Ser to Thr at amino acid 65 and 
Phe to Leu at position 64 and is encoded by a gene with optimized human 
codons (see, e.g., U.S. Patent No. 6,054,312). EGFP is a red-shifted variant 
of wild-type GFP (Yang et al. (1996) Nucl. Acids Res. 24:4592-4593; Haas 
et al. (1996) Curr. Biol. 6:315-324; Jackson et al. (1990) Trends Biochem. 
15:477-483) that has been optimized for brighter fluorescence and higher 
expression in mammalian cells (excitation maximum = 488 nm; emission 
maximum = 507 nm). EGFP encodes the GFPmut1 variant (Jackson (1990) 
Trends Biochem. 15:477-483), which contains the double-amino-acid 
substitution of Phe-64 to Leu and Ser-65 to Thr. Sequences flanking EGFP 
have been converted to a Kozak consensus translation initiation site (Huang 
et al. (1990) Nucleic Acids Res. 18:937-947) to further increase the 
translation efficiency in eukaryotic cells. 

Nucleic acid from the maize R gene complex can also be used as 
nucleic acid encoding a reporter molecule. The R gene complex in maize 
encodes a protein that acts to regulate the production of anthocyanin 
pigments in most seed and plant tissue. Maize strains can have one, or as 
many as four, R alleles which combine to regulate pigmentation in a 
developmental and tissue-specific manner. Thus, an R gene introduced into 
such cells will cause the expression of a red pigment and, if stably 
incorporated, can be visually scored as a red sector. If a maize line carries 
dominant alleles for genes encoding the enzymatic intermediates in the 
anthocyanin biosynthetic pathway (C2, A1, A2, Bz1 and Bz2), but carries a 
recessive allele at the R locus, the transformation of any cell from that line 
with R will result in red pigment formation. Exemplary lines include 
Wisconsin 22, which contains the rg-Stadler allele, and TR112, a K55 
derivative which is r-g, b, Pl. Alternatively, any genotype of maize can be 
utilized if the C1 and R alleles are introduced together. 

b. Promoters and other sequences that influence gene 
expression 

Expression of nucleic acid encoding a selectable marker (or any 
heterologous nucleic acid) in a recipient cell can be regulated by a variety of 
promoters. Promoters for use in regulating transcription of DNA in cells, 
particularly plant cells, include, but are not limited to, the nopaline synthase 
(NOS) and octopine synthase (OCS) promoters, cauliflower mosaic virus 
(CaMV) 19S and 35S promoters, the light-inducible promoter from the small 
subunit of ribulose bis-phosphate carboxylase (ssRUBISCO, an abundant 
plant polypeptide), the mannopine synthase (MAS) promoter [see, e.g., 
Velten et al. (1984) EMBO J. 3:2723-2730; and Velten and Schell (1985) 
Nuc. Acids Res. 13:6981-6998], the rice actin promoter, the ubiquitin 
promoter, for example, from Z. mays (see, e.g., PCT Application Publication 
No. WO00/60061), the Arabidopsis thaliana UBI 3 promoter [see, e.g., Norris 
et al. (1993) Plant Mol. Biol. 22:895-906] and the chemically inducible PR-1 
promoter from tobacco or Arabidopsis (see, e.g., U.S. Patent No. 5,689,044). 

Selection of a suitable promoter may include several considerations, 
for example, recipient cell type (such as, for example, leaf epidermal cells, 
mesophyll cells, root cortex cells), tissue- or organ-specific (e.g., roots, 
leaves or flowers) expression of genes linked to the promoter, and timing and 
level of expression (as may be influenced by constitutive vs. regulatable 
promoters and promoter strength). 

Additional sequences that may also be included in the nucleic acid 
containing a selectable marker include, but are not restricted to, transcription 
terminators and extraneous sequences to enhance expression such as 
introns. A variety of transcription terminators may be used which are 
responsible for termination of transcription beyond a coding region and 
correct polyadenylation. Appropriate transcription terminators include those 
that are known to function in plants such as, for example, the CaMV 35S 
terminator, the tml terminator, the nopaline synthase terminator and the pea 
rbcS E9 terminator, all of which may be used in both monocotyledonous and 
dicotyledonous plants. 

Numerous sequences have been found to enhance gene expression 
from within the transcriptional unit and these sequences can be used in 
conjunction with selectable marker and other genes to increase expression of 
the genes in plant cells. For example, various intron sequences such as 
introns of the maize Adh1 gene have been shown to enhance expression, 
particularly in monocotyledonous cells. In addition, a number of 
non-translated leader sequences derived from viruses are also known to 
enhance expression, and these are particularly effective in dicotyledonous 
cells. 

c. Nucleic acids containing targeting sequences 

Development of a multicentric, particularly dicentric, chromosome 
typically is effected through integration of heterologous nucleic acid into 
heterochromatin, such as the pericentric heterochromatin, near or within the 
centromeric regions of chromosomes and/or into rDNA sequences. Thus, the 
development of artificial chromosomes may be facilitated by targeting the 
heterologous nucleic acid for integration into these regions, such as by 
introducing DNA, including, but not limited to, rDNA (e.g., rDNA intergenic 
spacer sequence), satellite DNA, pericentric DNA and lambda phage DNA, 
into the recipient host cell. The targeting sequence may be introduced alone 
or with other nucleic acids, including but not limited to selectable markers. 
For example, a targeting sequence can be linked to a selectable marker. 

Examples of plant pericentric DNA and satellite DNA include, but are 
not limited to, pericentromeric sequences on tomato chromosome 6 [see, 
e.g., Weide et al. (1998) Mol. Gen. Genet. 255:190-197], satellite DNA of 
soybean [see, e.g., Morgante et al. (1997) Chromosome Res. 5:363-373; 
and Vahedian et al. (1995) Plant Mol. Biol. 25:857-862], pericentromeric 
DNA of Arabidopsis thaliana [see, e.g., Tutois et al. (1999) Chromosome 
Res. 7:143-156], satellite DNA of Arabidopsis thaliana (GenBank accession 
nos. AB033593 and X58104), pericentric DNA of the chickpea [Cicer 
arietinum L.; see, e.g., Staginnus et al. (1999) Plant Mol. Biol. 35:1037- 
1050], satellite DNA on the rye B chromosome [see, e.g., Langdon et al. 
(2000) Genetics 154:869-884], subtelomeric satellite DNA from Silene 
latifolia [see, e.g., Garrido-Ramos et al. (1999) Genome 42:442-446] and 
satellite DNA in the Saccharum complex [see, e.g., Alix et al. (1998) 
Genome 41:854-864]. 

Examples of rDNA targeting sequences include nucleic acids from 
plant and animal rDNA. Plant rDNA sequences include, but are not limited 
to, sequences contained in GENBANK Accession numbers D16103 [from 
rDNA of carrot (Daucus carota)], M23642 and M11585 [from rDNA encoding 
24S rRNA of rice (Oryza sativa)], M26461 [from rDNA encoding 18S 
rRNA of rice (Oryza sativa)], M16845 [from rDNA encoding 17S, 5.8S and 
25S rRNA of rice (Oryza sativa)], X82780 and X82781 [from rDNA encoding 
5S rRNA of potato (Solanum tuberosum)], AJ131161, AJ131162, 
AJ131163, AJ131164, AJ131165, AJ131166 and AJ131167 [from rDNA 
encoding 5S rRNA of tobacco (Nicotiana tabacum)], L36494 and U31016 
through U31030 [from rDNA encoding 5S rRNA of barley (Hordeum 
spontaneum)], U31004 through U31015 and U31031 [from rDNA encoding 
5S rRNA of barley (Hordeum bulbosum)], Z11759 [from rDNA encoding 5.8S 
rRNA of barley (Hordeum vulgare)], X16077 (from rDNA encoding 18S rRNA 
of Arabidopsis thaliana), M65137 (from rDNA encoding 5S rRNA of 
Arabidopsis thaliana), AJ232900 (from rDNA encoding 5.8S rRNA of 
Arabidopsis thaliana) and X52320 (from Arabidopsis thaliana genes for 5.8S 
and 25S rRNA with an 18S rRNA fragment). 

Intergenic spacer regions of plant rDNA include, but are not limited to, 
sequences contained in GENBANK Accession numbers S70723 (from the 5S 
rDNA of barley (Hordeum vulgare)), AF013103 and X03989 (from maize 
(Zea mays)), X65489 (from potato (Solanum tuberosum)), X52265 (from 
tomato (Lycopersicon esculentum)), AF177418 (from Arabidopsis neglecta), 
AF177421 and AF17422 (from Arabidopsis halleri), A71562, X15550, 
X52631, U43224, X52320, X52636 and X52637 (from Arabidopsis 
thaliana; see Gruendler et al. (1991) J. Mol. Biol. 221:1209-1222 and 
Gruendler et al. (1989) Nucleic Acids Res. 17:6395-6396), X54194 [from 
rice (Oryza sativa)], Y08422 and D76443 [from tobacco (Nicotiana 
tabacum)], AJ243073 [from wheat (Triticum boeoticum)] and X07841 [from 
wheat (Triticum aestivum)]. Sequences of intergenic spacer regions of plant 
rDNA further include sequences from rye [see Appels et al. (1986) Can. J. 
Genet. Cytol. 28:673-685], wheat [see Barker et al. (1988) J. Mol. Biol. 
201:1-17 and Sardana and Flavell (1996) Genome 39:288-292], radish [see 
Delcasso-Tremousaygue et al. (1988) Eur. J. Biochem. 172:767-776], Vicia 
faba and Pisum sativum [see Kato et al. (1990) Plant Mol. Biol. 14:983-993], 
mung bean [see Gerstner et al. (1988) Genome 30:723-733; and Schiebel et 
al. (1989) Mol. Gen. Genet. 218:302-307], tomato [see Schmidt-Puchta et 
al. (1989) Plant Mol. Biol. 13:251-253], Hordeum bulbosum [see Procunier et 
al. (1990) Plant Mol. Biol. 15:661-663], Lens culinaris Medik., and other 
legume species [see Fernandez et al. (2000) Genome 43:597-603] and 
tobacco [see U.S. Patent Nos. 6,100,092 and 6,096,546 and PCT 
Application Publication No. WO99/66058; Borysyuk et al. (1997) Plant Mol. 
Biol. 35:655-660; Borysyuk et al. (2000) Nature Biotechnology 18:1303- 
1306]. 

Mammalian rDNA sequences include, but are not limited to, DNA of
GENBANK accession no. X82564 and portions thereof, the DNA of
GENBANK accession no. U13369 and portions thereof and DNA sequences
provided in PCT Application Publication No. WO97/40183 (particularly SEQ.
ID. NOS. 18-24 of WO97/40183). A particular vector for use in directing
integration of heterologous nucleic acid into chromosomal rDNA is pTERPUD
(see PCT Application Publication No. WO97/40183). Satellite DNA
sequences can also be used to direct the heterologous DNA to integrate into
the pericentric heterochromatin. For example, vectors pTEMPUD and
pHASPUD, which contain mouse and human satellite DNA, respectively (see
PCT Application Publication No. WO97/40183), are examples of vectors that
may be used for introduction of heterologous nucleic acid into cells for de
novo chromosome formation leading to artificial chromosomes.

3. Methods for introduction of heterologous nucleic acids into host 
cells 

Any methods known in the art for introducing heterologous nucleic
acids into host cells may be used in the methods of preparing artificial
chromosomes. The particular method used may depend on the type of cell
into which the heterologous nucleic acid is being transferred. For example,
methods for the physical introduction of nucleic acids into plant cells, for
example, protoplasts and plant cells in culture, include, but are not limited to,
polyethylene glycol (PEG)-mediated DNA uptake, electroporation, lipid-
mediated delivery, including liposomes, calcium phosphate-mediated DNA
uptake, microinjection, particle bombardment, silicon carbide whisker-
mediated transformation and combinations of these methods, for example
methods utilizing combinations of calcium phosphate and PEG for DNA
uptake or methods utilizing a combination of electroporation, PEG and heat
shock (see, e.g., U.S. Patent Nos. 5,231,019 and 5,453,367). Physical
methods such as these are known in the art and are effective in introducing
DNA into a variety of dicotyledonous and monocotyledonous plants [see,
e.g., Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985)
Mol. Gen. Genet. 199:169-177; Reich et al. (1986) Biotechnology 4:1001-
1004; Klein et al. (1987) Nature 327:70-73; U.S. Patent No. 6,143,949;
Paszkowski et al. (1989) in Cell Culture and Somatic Cell Genetics of Plants,
Vol. 6, Molecular Biology of Plant Nuclear Genes, eds. Schell, J. and Vasil,
I.K., Academic Publishers, San Diego, California, p. 52-68; and Frame et al.
(1994) Plant J. 6:941-948].

In addition to these methods for the introduction of nucleic acids into
plant cells based on physically, mechanically or chemically mediated
processes, it is possible to introduce nucleic acids into plant cells by
biological methods, such as those utilizing Agrobacterium. In this method,
nucleic acid sequences located adjacent to T-DNA border repeats can be
inserted into the genome of a plant cell, typically dicotyledonous plant cells,
by utilizing the encoded function for DNA transfer found in the genus
Agrobacterium. This method has also been shown to work for some
monocotyledonous plant cells, such as rice cells.

Any method for introducing nucleic acids into plant cells can be used
in the generation of artificial chromosomes, provided the method is capable
of introducing the nucleic acid into an amplifiable region of a chromosome,
for example, heterochromatin, and particularly in close proximity to a
megareplicator region of a plant chromosome.

a. Agrobacterium-mediated introduction of nucleic acids
into plant cells

Agrobacterium-mediated transformation is particularly well-suited for
transformation of dicotyledons because of its high efficiency of
transformation and its broad utility with many different species, including
tobacco, tomato (see, e.g., European Patent Application no. 0 249 432),
sunflower, cotton (see, e.g., European Patent Application no. 0 317 511),
oilseed rape, potato, soybean, alfalfa and poplar (see, e.g., U.S. Patent No.
4,795,855) (see also PCT Application Publication no. WO87/07299 with
respect to transformation of Brassica). Agrobacterium-mediated
transformation has also been used to transfer nucleic acids into
monocotyledonous plants. Agrobacterium-mediated transformation of
Chlorophytum capense and Narcissus cv "Paperwhite" [see, e.g., Hooykaas-
Van Slogteren et al. (1984) Nature 311:763-764], corn and wheat [see, e.g.,
U.S. Patent Nos. 5,164,310, 5,187,073 and 5,177,010 and Mooney et al.
(1991) Plant Cell, Tissue, Organ Culture 25:209-218], rice [see, e.g., Raineri
et al. (1990) Bio/Technology 8:33-38 and Chan et al. (1993) Plant Mol. Biol.
22:491-506] and barley [see, e.g., Tingay et al. (1997) The Plant J.
11:1369-1376 and Qureshi et al. (1998) Proc. 42nd Conference of
Australian Society for Biochemistry and Molecular Biology, September 28-
October 1, 1998, Adelaide, Australia] has been reported.

Agrobacterium-mediated delivery of nucleic acids is based on the
capacity of certain Agrobacterium strains to introduce a part of their Ti
(tumor-inducing) plasmid, i.e., the transforming DNA or T-DNA, into plant
cells and to integrate this T-DNA into the genome of the cells. The part of
the Ti plasmid that is transferred and integrated is delineated by specific DNA
sequences, the left and right T-DNA border sequences. The natural T-DNA
sequences between these border sequences can be replaced by foreign DNA
[see, e.g., European Patent Publication 116 718 and Deblaere et al. (1987)
Meth. Enzymol. 153:277-293].

When Agrobacterium is used for transformation, the heterologous
nucleic acid being transferred typically is cloned into a plasmid that contains
T-DNA border regions and is replicated independently of the Ti plasmid
(referred to as the binary vector system) or the heterologous nucleic acid is
inserted between the T-DNA borders of the Ti plasmid (referred to as the co-
integrate method). In co-integrate methods, the vectors carrying the
heterologous nucleic acid (intermediate vectors) are integrated
into the Ti or Ri plasmid by homologous recombination owing to sequences
that are homologous to sequences within the T-DNA region of the Ti or Ri
plasmid. The Ti or Ri plasmid also contains the vir region necessary for
transfer of the T-DNA.

Intermediate vectors cannot replicate in Agrobacteria. The
intermediate vector can be transferred into Agrobacterium by means of a
helper plasmid (conjugation; see Fraley et al. (1983) Proc. Natl. Acad. Sci.
USA 80:4803). This method, typically referred to as triparental mating,
introduces the heterologous nucleic acid sequence into the bacterium and
allows for selection of a homologous recombination event that produces the
desired Agrobacterium genotype. The triparental mating procedure typically
employs Escherichia coli carrying the recombinant intermediate vector and a
helper E. coli strain which carries a plasmid that is able to mobilize the
recombinant intermediate vector to the target Agrobacterium strain. A
modified Ti or Ri plasmid, which contains a heterologous nucleic acid
sequence located within the T-DNA region, is obtained from the transfer and
selection process. The resultant Agrobacterium strain is capable of transferring
the heterologous nucleic acid to plant cells.

Binary vectors can replicate both in E. coli and Agrobacterium. They
typically contain a selection marker gene and a linker or polylinker which are
flanked by the right and left T-DNA border regions and can be transformed
directly into Agrobacterium [see, e.g., Hofgen and Willmitzer (1988) Nuc.
Acids. Res. 16:9877 and Holsters et al. (1978) Mol. Gen. Genet. 163:181-
187] or introduced through triparental mating. The Agrobacterium host cell
contains a plasmid carrying a vir region needed for transfer of the T-DNA into
a plant cell [see, e.g., White in Plant Biotechnology, eds. Kung, S. and
Arntzen, C.J., Butterworth Publishers, Boston, Mass., (1989) p. 3-34 and
Fraley in Plant Biotechnology, eds. Kung, S. and Arntzen, C.J., Butterworth
Publishers, Boston, Mass., (1989) p. 395-407].

Agrobacterium-mediated transformation typically involves the transfer
of a binary vector carrying the heterologous nucleic acid of interest to an
appropriate Agrobacterium strain, the choice of which may depend on the
complement of
vir genes carried by the host Agrobacterium strain either on a co-resident Ti
plasmid or chromosomally (see, e.g., Uknes et al. (1993) Plant Cell 5:159-
169). The transfer of a recombinant binary vector to Agrobacterium can be
accomplished by a triparental mating procedure using Escherichia coli carrying
the recombinant binary vector and a helper E. coli strain which carries a plasmid
that is able to mobilize the recombinant binary vector to the target
Agrobacterium strain. Alternatively, the recombinant binary vector can be
transferred to Agrobacterium by DNA transformation (see, e.g., Hofgen &
Willmitzer (1988) Nuc. Acids. Res. 16:9877).

Many vectors are available for transfer of nucleic acids into
Agrobacterium tumefaciens [see, e.g., Rogers et al. (1987) Methods in
Enzymol. 153:253-277]. These typically carry at least one T-DNA border
sequence and include vectors such as pBIN19 [see, e.g., Bevan (1984) Nuc.
Acids. Res. 12:8711-8721]. Typical vectors suitable for Agrobacterium
transformation include the binary vectors pCIB200 and pCIB2001, as well as
the binary vector pCIB10 and hygromycin selection derivatives thereof (see,
e.g., U.S. Patent No. 5,639,949). Other vectors that can be employed are
the pCambia vectors (see www.cambia.org), including, for example,
pCambia 3300 and pCambia 1302 (GenBank Accession No. AF234298).

A particularly useful Ti plasmid cassette vector for the transformation
of dicotyledonous plants contains the enhanced CaMV35S promoter (EN35S)
and the 3' end, including polyadenylation signals, of a soybean gene
encoding the α subunit of β-conglycinin. Between these two elements is a
multilinker containing multiple restriction sites for the insertion of genes of
interest (see, e.g., U.S. Patent No. 6,023,013). The vector can contain a
segment of pBR322 which provides an origin of replication in E. coli and a
region for homologous recombination with the disarmed T-DNA in
Agrobacterium strain ACO; the oriV region from the broad host range
plasmid RK1; the streptomycin/spectinomycin resistance gene from Tn7; and
a chimeric NPTII gene, containing the CaMV35S promoter and the nopaline
synthase (NOS) 3' end, which provides kanamycin resistance in transformed
plant cells. Optionally, the enhanced CaMV35S promoter may be replaced
with the 1.5 kb mannopine synthase (MAS) promoter (see, e.g., Velten et al.
(1984) EMBO J. 3:2723-2730). After incorporation of a DNA construct into
the vector, it is introduced into A. tumefaciens strain ACO which contains a
disarmed Ti plasmid. Cointegrate Ti plasmid vectors are selected and
subsequently may be used to transform a dicotyledonous plant.

Transformation of the target plant species by recombinant
Agrobacterium usually involves co-cultivation of the Agrobacterium with
explants from the plant and follows published protocols. Methods of
inoculation of the plant tissue vary depending upon the plant species and the
Agrobacterium delivery system. The plant tissue can be either protoplast,
callus or organ tissue, depending on the plant species. A widely used
approach is the leaf disc procedure which can be performed with any tissue
explant that provides a good source for initiation of whole plant
differentiation (see, e.g., Horsch et al. in Plant Molecular Biology Manual A5,
Kluwer Academic Publishers, Dordrecht (1988) p. 1-9 and U.S. Patent No.
6,136,320). The addition of nurse tissue may be desirable under certain
conditions. There are multiple choices of Agrobacterium strains (including,
but not limited to, A. tumefaciens and A. rhizogenes) and plasmid
construction strategies that can be used to optimize genetic transformation
of plants. Transformed tissue carrying an antibiotic or herbicide resistance
marker present between the T-DNA borders of the binary plasmid can be
regenerated on selectable medium.

A. tumefaciens ACO is a disarmed strain similar to pTiB6SE (see
Fraley et al. (1985) Bio/Technology 3:629-635). For construction of ACO,
the starting Agrobacterium strain was A208 which contains a nopaline-type
Ti plasmid. The Ti plasmid was disarmed in a manner similar to that
described by Fraley et al. [(1985) Bio/Technology 3:629-635] so that
essentially all of the native T-DNA was removed except for the left border
and a few hundred base pairs of T-DNA inside the left border. The remainder
of the T-DNA extending to a point just beyond the right border was replaced
with a piece of DNA including (from left to right) a segment of pBR322, the
oriV region from plasmid RK2, and the kanamycin resistance gene from
Tn601. The pBR322 and oriV segments are similar to the corresponding
segments of the cassette vector described above and
provide a region of homology for cointegrate formation (see U.S. Patent No.
6,023,013). Another useful strain of Agrobacterium is A. tumefaciens strain
GV3101/pMP90 [see, e.g., Koncz and Schell (1986) Mol. Gen. Genet.
204:383-396].

Advances in Agrobacterium-mediated transfer allow introduction of
larger segments of nucleic acids [see, e.g., Hamilton (1997) Gene 200(1-
2):107-116; Hamilton et al. (1996) Proc. Natl. Acad. Sci. U.S.A. 93:9975-
9979; Liu et al. (1999) Proc. Natl. Acad. Sci. U.S.A. 96:6535-6540]. The
vectors used in these methods are designed to have the characteristics of
both bacterial artificial chromosomes (BACs) and binary vectors for
Agrobacterium-mediated transformation. Therefore, somewhat larger DNA
fragments cloned in the T-DNA region can be transferred into a plant genome
by Agrobacterium. Binary bacterial artificial chromosome (BIBAC) vector
BIBAC2 (see U.S. Patent No. 5,733,744; available from the Plant Science
Center, Cornell University) and the transformation-competent bacterial
artificial chromosome (TAC) vector pYLTAC7 (available from the Plant Cell
Bank of the RIKEN Gene Bank, Tsukuba, Japan) are examples of the types of
vectors that may be used in transferring larger segments of nucleic acids,
particularly heterologous nucleic acids containing targeting and/or selectable
marker sequences as described herein, into plants via Agrobacterium-
mediated DNA transfer processes.

Introduction of heterologous nucleic acids into plant cells without the
use of Agrobacterium circumvents the requirements for T-DNA sequences in
the transformation vector and consequently vectors lacking these sequences
can be utilized in addition to vectors containing T-DNA sequences.
Techniques for nucleic acid transfer that do not rely on Agrobacterium
include transformation via particle bombardment, direct DNA uptake (e.g.,
PEG, lipids, electroporation) and mechanical methods such as microinjection
or silicon "whiskers". The choice of vector that may be used in introduction
of heterologous nucleic acids into plant cells can depend largely on the
preferred selection for the species being transformed. Typical vectors
suitable for transformation without Agrobacterium include pCIB3064,
pSOG19 and pSOG35 (see, e.g., U.S. Patent No. 5,639,949), or common
plasmid, phage or cosmid vectors.

b. Direct DNA Uptake 
Introduction of heterologous nucleic acids into plant cells may be
achieved using a variety of methods that facilitate direct DNA uptake,
including calcium phosphate precipitation, polyethylene glycol (PEG)
treatment, electroporation, and combinations thereof [see, e.g., Potrykus et
al. (1985) Mol. Gen. Genet. 199:183; Lorz et al. (1985) Mol. Gen. Genet.
199:178; Fromm et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828;
Uchimiya et al. (1986) Mol. Gen. Genet. 204:204; Callis et al. (1987) Genes
Dev. 1:1183-1200; Callis et al. (1987) Nuc. Acids Res. 15:5823-5831;
Marcotte et al. (1988) Nature 335:454; Toriyama et al. (1988)
Bio/Technology 6:1072-1074; Hain et al. (1985) Mol. Gen. Genet. 199:161-
168; Deshayes et al. (1985) EMBO J. 4:2731-2737; Krens et al. (1982)
Nature 296:72-74; Crossway et al. (1986) Mol. Gen. Genet. 202:179].

Typically, plant protoplasts are used for direct DNA uptake, or in some
instances plant tissue that has been treated to remove a portion or the
majority of the cell wall (see, e.g., PCT Publication No. WO93/21335 and
U.S. Patent No. 5,472,869). Removal of the cell wall is believed to facilitate
entry of DNA into plant cells, although in some instances electroporation may
be used to introduce DNA into specialized plant cells, e.g., electroporation of
pollen, without first removing the cell wall.

Techniques for the preparation of callus and protoplasts from maize,
transformation of protoplasts using PEG or electroporation, and the
regeneration of maize plants from transformed protoplasts are found, for
example, in European Patent Application nos. 0 292 435 and 0 392 225 and
PCT Application Publication no. WO93/07278. Transformation of rice can
also be undertaken by direct gene transfer techniques utilizing protoplasts
[see, e.g., Zhang et al. (1988) Plant Cell Rep. 7:379-384; Shimamoto et al.
(1989) Nature 338:274-277; Datta et al. (1990) Biotechnology 8:736-740].
The regeneration of fertile transgenic barley by direct DNA transfer to
protoplasts is described, for example, by Funatsuki et al. [(1995) Theor.
Appl. Genet. 91:707-712]. Other plant species, including tobacco and
Arabidopsis, may also serve as sources of protoplasts for use in introduction
of heterologous nucleic acids into plant cells.

c. Particle bombardment-mediated introduction of nucleic
acids into plant cells

Microprojectile bombardment of plant cells can be an effective method
for the introduction of nucleic acids into plant cells. In these methods,
nucleic acids are carried through the cell wall and into the cytoplasm on the
surface of small, typically metal, particles [see, e.g., Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Natl. Acad. Sci. U.S.A. 85:8502-
8505; Klein et al. in Progress in Plant Cellular and Molecular Biology, eds.
Nijkamp, H.J.J., Van der Plas, J.H.W., and Van Aartrijk, J., Kluwer
Academic Publishers, Dordrecht, (1988), p. 56-66; Seki et al. (1999) Mol.
Biotechnol. 11:251-255; and McCabe et al. (1988) Bio/Technology 6:923-
926]. Particles may be coated with nucleic acids and delivered into cells by
a propelling force. Exemplary particles include those containing tungsten,
gold or platinum, as well as magnesium sulfate crystals. The metal
particles can penetrate through several layers of cells and thus allow the
transformation of cells within tissue explants.

In an illustrative embodiment (see, e.g., U.S. Patent No. 6,023,013) of
a method for delivering nucleic acids into plant cells, e.g., maize cells, by
acceleration, a Biolistics Particle Delivery System may be used to propel
particles coated with DNA or cells through a screen, such as a stainless steel
or Nytex screen, onto a filter surface covered with plant (e.g., corn) cells
cultured in suspension. The screen disperses the particles so that they are
not delivered to the recipient cells in large aggregates. The intervening
screen between the projectile apparatus and the cells to be bombarded may
reduce the size of projectile aggregates and may contribute to a higher
frequency of transformation by reducing damage inflicted on the recipient
cells by projectiles that are too large.

For the bombardment, cells in suspension may be concentrated on
filters or solid culture medium. Alternatively, immature embryos or other
target cells may be arranged on solid culture medium. The cells to be
bombarded are typically positioned at an appropriate distance below the
macroprojectile stopping plate. If desired, one or more screens may also be
positioned between the acceleration device and the cells to be bombarded.

The prebombardment culturing conditions and bombardment
parameters may be optimized to yield the maximum numbers of stable
transformants. Both the physical and biological parameters for bombardment
can be important in this technology. Physical factors include those that
involve manipulating the DNA/microprojectile precipitate or those that affect
the flight and velocity of either the macro- or microprojectiles. Biological
factors include all steps involved in manipulation of cells before and
immediately after bombardment, the osmotic adjustment of target cells to
help alleviate the trauma associated with bombardment, and also the nature
of the transforming nucleic acid, such as linearized DNA or intact supercoiled
plasmids.

Physical parameters that may be adjusted include gap distance, flight
distance, tissue distance and helium pressure. In addition, transformation
may be optimized by adjusting the osmotic state, tissue hydration and
subculture stage or cell cycle of the recipient cells.
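
As one way to organize such an optimization, the four physical parameters named above can be laid out as a grid of candidate settings, each of which is then tested and scored by the number of stable transformants recovered. The following is a minimal sketch only: the numeric values, units and the names enumerate_conditions and best_condition are hypothetical placeholders chosen for illustration and are not settings taken from this description.

```python
from itertools import product

# Hypothetical candidate settings for the four physical parameters named in
# the text; the values are placeholders, not recommended conditions.
PARAMETER_GRID = {
    "gap_distance_mm": [3, 6, 9],
    "flight_distance_mm": [6, 11],
    "tissue_distance_cm": [6, 9, 12],
    "helium_pressure_psi": [650, 1100, 1350],
}


def enumerate_conditions(grid):
    """Return every combination of the candidate settings as a dict."""
    names = list(grid)
    return [dict(zip(names, values))
            for values in product(*(grid[name] for name in names))]


def best_condition(scored_conditions):
    """Pick the condition with the most stable transformants.

    scored_conditions is a list of (condition, transformant_count) pairs
    recorded by the experimenter after each bombardment.
    """
    condition, _count = max(scored_conditions, key=lambda pair: pair[1])
    return condition


conditions = enumerate_conditions(PARAMETER_GRID)
```

With the placeholder grid above, conditions holds 3 x 2 x 3 x 3 = 54 combinations. In practice a subset would usually be tested, and the biological parameters (osmotic state, tissue hydration, subculture stage) could be added to the grid in the same way.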

Techniques for transformation of an A188-derived maize line using
particle bombardment are described in Gordon-Kamm et al. [(1990) Plant Cell
2:603-618] and Fromm et al. [(1990) Biotechnology 8:833-839].
Transformation of rice may also be accomplished via particle bombardment
[see, e.g., Christou et al. (1991) Biotechnology 9:957-962]. Particle
bombardment may also be used to transform wheat [see, e.g., Vasil et al.
(1992) Biotechnology 10:667-674 for transformation of cells of type C long-
term regenerable callus; and Weeks et al. (1993) Plant Physiol. 102:1077-
1084 for transformation of wheat using particle bombardment of immature
embryos and immature embryo-derived callus]. The production of transgenic
barley using bombardment methods is described, for example, by Koprek et
al. [(1996) Plant Sci. 119:79-91].

d. Electroporation-mediated introduction of nucleic acids
into plant cells

The application of brief, high-voltage electric pulses to a variety of
animal and plant cells leads to the formation of nanometer-sized pores in the
plasma membrane. Nucleic acids are taken directly into the cell cytoplasm
either through these pores or as a consequence of the redistribution of
membrane components that accompanies closure of the pores.
Electroporation can be extremely efficient and can be used both for transient
expression of cloned genes and for the establishment of cell lines that carry
integrated copies of the gene of interest.

Certain cell wall-degrading enzymes, such as pectin-degrading
enzymes, may be employed to render the target recipient cells more
susceptible to transformation by electroporation than untreated cells.
Alternatively, recipient cells may be made more susceptible to transformation by
mechanical wounding. To effect transformation by electroporation, friable
tissues such as a suspension culture of cells or embryonic callus may be
used or immature embryos or other organized tissues may be directly
transformed [see, e.g., Fromm et al. (1986) Nature 319:791-793; and
Neumann et al. (1982) EMBO J. 1:841-845].

e. Microinjection-mediated introduction of nucleic acids into
plant cells

In microinjection techniques, nucleic acids are mechanically injected
directly into cells using very small micropipettes. For example, microinjection
of protoplast cells with foreign DNA for transformation of plant cells has
been reported for barley and tobacco [see, e.g., Holm et al. (2000)
Transgenic Res. 9:21-32 and Schnorf et al. (1991) Transgenic Res. 1:23-30].

f. Lipid-mediated introduction of nucleic acids into plant
cells

In lipid-mediated transfer, nucleic acids are contacted with lipids
and/or encapsulated in lipid-containing structures, including but not limited to
liposomes, and the nucleic acid-containing liposomes are fused with plant
protoplasts. The fusion can occur in the presence or absence of a fusogen,
such as PEG. Lipid-mediated transformation of plant protoplasts has been
reported [see, e.g., Fraley and Papahadjopoulos (1982) Curr. Top. Microbiol.
Immunol. 96:171-191; Deshayes et al. (1985) EMBO J. 4:2731-2737 and
Spoerlein and Koop (1991) Theor. Appl. Genetics 83:1-5].

g. Other methods of introduction of nucleic acids into plant 
cells 

Other methods to physically introduce nucleic acid into plant cells may
be used, including silicon carbide fibers ("whiskers") that are used to pierce
plant cell walls thereby facilitating nucleic acid uptake, the use of sound
waves to introduce holes in plant cell membranes to facilitate nucleic acid
uptake (e.g., sonoporation) and the use of laser beams to open holes in cell
membranes facilitating the entry of nucleic acids (e.g., laser poration).

Nucleic acids may also be imbibed by hydrating plant tissue, providing
another method for nucleic acid uptake into plant cells [see, e.g., Simon
(1974) New Phytologist 73:377-420]. For example, nucleic acids may be
taken into cereal and legume seed embryos by imbibition [see, e.g., Toepfer
et al. (1989) The Plant Cell 1:133-139].

4. Treatment of cells into which heterologous nucleic acids have 
been introduced 

Cells into which heterologous nucleic acids have been introduced may
be analyzed for de novo formation of artificial chromosomes described herein
such as may result from amplification of chromosomal segments occurring in
connection with integration of heterologous nucleic acids into chromosomes.
Typically, amplification occurs over multiple generations of cell division
leading to the formation of detectable changes in chromosome structure.
Therefore, transfected cells are typically cultured through multiple cell
divisions, from about 5 to about 60, or about 5 to about 55, or about 10 to
about 55, or about 25 to about 55, or about 35 to about 55 cell divisions
following introduction of nucleic acid into a cell. Artificial chromosomes
may, however, appear after only about 5 to about 15 or about 10 to about
15 cell divisions. Cells into which heterologous nucleic acids have been introduced
may be treated in a variety of ways prior to or during analysis thereof for the
presence of artificial chromosomes.
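
To relate the numbers of cell divisions given above to culture time, a rough conversion can be made once a doubling time for the particular cell line has been measured. The sketch below is illustrative arithmetic only: it assumes every cell divides once per doubling time with no cell death, and the 48-hour doubling time used in the example is a hypothetical value, not one stated in this description.

```python
# Ranges of cell divisions named in the text, as (minimum, maximum) pairs.
DIVISION_RANGES = [(5, 60), (5, 55), (10, 55), (25, 55), (35, 55)]


def culture_days(divisions, doubling_time_hours):
    """Days of culture needed for the given number of cell divisions."""
    return divisions * doubling_time_hours / 24


def fold_expansion(divisions):
    """Idealized increase in cell number after the given divisions."""
    return 2 ** divisions


# Example with an assumed (hypothetical) 48-hour doubling time.
for low, high in DIVISION_RANGES:
    print(f"{low}-{high} divisions: "
          f"{culture_days(low, 48):.0f}-{culture_days(high, 48):.0f} days")
```

Because the idealized fold expansion grows very quickly (for example, 2 to the power 10 is 1024), cultures carried through many divisions are in practice subcultured repeatedly rather than expanded continuously.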

For example, cells into which nucleic acid encoding a selectable
marker required for growth in the presence of a selection agent has been
transferred can be treated as the exemplified cells herein to facilitate
generation of multicentric chromosomes, and fragmentation thereof, and/or
the generation of artificial chromosomes. The cells may be grown in the
presence of an appropriate concentration of selection agent, which may be
determined empirically by growing untransfected cells in varying
concentrations of the agent and identifying concentrations sufficient to
prevent cell growth and/or facilitate amplification of chromosomal segments.
Transfected cells may be grown in selective media for numerous generations
and cell lines can be established that contain the introduced nucleic acid.
The concentration of selection agent may also be increased over several
generations to promote amplification of a region of a chromosome into which
heterologous nucleic acid integrated. Transfected cells may also be treated
to destabilize the chromosomes to facilitate generation and fragmentation of
a multicentric, typically dicentric, chromosome.

Additional heterologous nucleic acid, e.g., nucleic acid encoding a
selectable marker, may also be introduced into the transfected cells to
facilitate amplification of chromosomal segments, such as the pericentric
heterochromatin, contained in, for example, a fragment released from a
multicentric chromosome (e.g., a formerly dicentric chromosome), and
generation of a heterochromatic artificial chromosome. The resulting
transformed cells can then be grown in the presence of a selection agent,
which may be a second agent (if the heterologous nucleic acid introduced
into the transfected cells encodes a selectable marker different from any
selectable marker encoded by heterologous nucleic acid initially transferred
into the original host cells), with or without the first selection agent.
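
The empirical determination of a selection agent concentration described above, and the stepwise increase in concentration over several generations, can be sketched as follows. This is a minimal illustration under stated assumptions: growth of untransfected cells is recorded as a single number per tested concentration (for example, relative fresh weight), a residual growth of 5% of the untreated control is treated as "prevented", and the function names, the example measurements and the two-fold escalation factor are all hypothetical.

```python
def minimal_inhibitory_concentration(growth_by_concentration, threshold=0.05):
    """Lowest tested concentration at which growth is effectively prevented.

    growth_by_concentration maps each tested concentration of selection
    agent to the growth of untransfected cells; the entry for 0.0 is the
    untreated control. Returns None if no tested concentration suffices.
    """
    control = growth_by_concentration[0.0]
    for concentration in sorted(c for c in growth_by_concentration if c > 0):
        if growth_by_concentration[concentration] / control <= threshold:
            return concentration
    return None


def escalation_schedule(start, fold, steps):
    """Concentrations for a stepwise increase over successive passages."""
    return [start * fold ** step for step in range(steps)]


# Hypothetical measurements for untransfected cells (arbitrary units).
untransfected_growth = {0.0: 1.00, 5.0: 0.62, 10.0: 0.21, 25.0: 0.03, 50.0: 0.0}
starting_concentration = minimal_inhibitory_concentration(untransfected_growth)
schedule = escalation_schedule(starting_concentration, fold=2, steps=3)
```

With the example measurements, the starting concentration is 25.0 and the schedule is 25.0, 50.0 and 100.0 in the same units. Where a second selection agent is used, the same determination would be repeated for that agent.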

Cells into which nucleic acids have been introduced may also be
subjected to cell sorting. For example, protoplasts may be prepared from
transfected plant cells or calli and subjected to sorting. If the sorting is
conducted prior to chromosomal analysis of the cells for the presence of
artificial chromosomes, it provides a population of transfected cells that may
be enriched for artificial chromosomes and thus facilitates the subsequent
chromosomal analysis of the cells.

The sorting is based on the presence of a detectable marker in the 
cells, as provided for by the introduced nucleic acid, which can provide the 

25 basis for isolating such cells from cells that do not contain the heterologous 
nucleic acid. For example, the nucleic acid introduced into the plant cells 
may contain nucleic acid encoding a fluorescent protein, such as a green, red 
or blue fluorescent protein, which may be used for selection, by flow 
cytometry and other methods, of recipient cells that have taken up and 

30 express the nucleic acid at readily detected levels. 

In an exemplary protocol, GFP fluorescence of transfected cell cultures
may be monitored visually during culture using an inverted microscope
equipped with epifluorescence illumination (Axiovert 25; Zeiss, North York,
ON) and a #41017 Endow GFP filter set (Chroma Technologies, Brattleboro,
VT). Enrichment of GFP-expressing populations can be carried out as
follows. Cell sorting may be carried out, for example, using a FACS Vantage
flow cytometer (Becton Dickinson Immunocytometry Systems, San Jose,
CA) equipped with turbo-sort option and 2 Innova 306 lasers (Coherent, Palo
Alto, CA). For cell sorting, a 70 µm nozzle can be used. The buffer can be
changed to PBS (maintained at 20 p.s.i.). GFP may be excited with a 488
nm laser beam and emission detected in FL1 using a 500 EFLP filter.
Forward and side scattering can be adjusted to select for viable cells. Gating
parameters may be adjusted using untransfected cells as negative control
and GFP CHO cells as positive control.
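
The gating step described above can be illustrated with a minimal Python sketch. It assumes that each sorted event is reduced to a single FL1 (GFP channel) intensity and that the gate is placed at a high percentile of the untransfected negative control. The function names, the 99.9th-percentile cutoff and the intensity values are hypothetical illustrations and are not taken from the protocol.

```python
import math


def gate_threshold(negative_fl1, percentile=99.9):
    """Return the FL1 intensity at or below which `percentile` percent of
    negative-control (untransfected) events fall (nearest-rank method)."""
    if not negative_fl1:
        raise ValueError("negative control must contain at least one event")
    ordered = sorted(negative_fl1)
    # Nearest-rank percentile: smallest value with at least `percentile`
    # percent of the events at or below it.
    rank = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    return ordered[rank - 1]


def sort_events(sample_fl1, threshold):
    """Split sample events into GFP-positive (collected) and
    non-expressing (directed to waste) by comparing FL1 to the gate."""
    collected = [x for x in sample_fl1 if x > threshold]
    waste = [x for x in sample_fl1 if x <= threshold]
    return collected, waste
```

In use, the threshold would be computed once from the untransfected control and then applied to each transfected sample; a positive control would serve only to confirm that true GFP-expressing events fall above the gate.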

For the first round of sorting, transfected cells may be harvested
post-transfection (e.g., about 7-14 days post-transfection), converted to
protoplasts, resuspended in about 10 ml of growth medium and sorted for
GFP-expressing populations using parameters described above. GFP-positive
cells may be dispensed into a volume of about 5-10 ml of protoplast medium
while non-expressing cells are directed to waste. The expressing cells may
be cultured. Plant cells or calli can then be analyzed by fluorescence in situ
hybridization screening.

5. Analysis of transformed cells and identification and 
manipulation of artificial chromosomes 

Cells into which nucleic acids have been introduced, and which may
or may not have been further treated as described herein, may be analyzed
for indications of amplification of chromosomal segments, the presence of
structures that may arise in connection with amplification and de novo
artificial chromosome formation and/or the presence of desired artificial
chromosomes as described herein. Analysis of the cells typically involves
methods of visualizing chromosome structure, including, but not limited to, G-
and C-banding, PCR, Southern blotting and FISH analyses, using techniques
described herein and/or known to those of skill in the art. Such analyses can
employ specific labelling of particular nucleic acids, such as satellite DNA
sequences, heterochromatin, rDNA sequences and heterologous nucleic acid
sequences, that may be subject to amplification. During analysis of
transfected cells, a change in chromosome number and/or the appearance of
distinctive chromosomal structures (distinguished, for example, by increased
segmentation arising from amplification of repeat units) will also assist in
identification of cells containing artificial chromosomes. The following
description of events and structures that may be observed in analyzing cells
for evidence of chromosomal amplification and/or the presence of artificial
chromosomes is intended to be illustrative of the observations and
considerations that may occur in the analysis of cells of any type, including
mammalian and plant cells. It should be recognized that numerous types of
structures may be formed during amplification of chromosomal segments and
treatment of the cells. Additional, yet related, structures and variations of
these structures are contemplated herein and are recognizable based on the
descriptions and teachings of the generation and identification of artificial
chromosomes presented herein. Each structure can be further manipulated,
for example using procedures described herein, to derive additional
chromosomal structures and compositions.

Typically, de novo centromere formation occurs in cells upon
integration of heterologous nucleic acids into the cell chromosomes and
amplification of chromosomal and heterologous nucleic acids. The
integration and amplification that give rise to de novo centromere formation
typically occur at the centromeric region of the short arm of a chromosome,
typically an acrocentric chromosome. By employing methods such as
chromosome-staining methods, including FISH and G- and C-banding, it may
be possible to identify a chromosome at which the process occurs.

The amplification can lead to the formation of multicentric, typically
dicentric, chromosomes. Because of the presence of two or more
functionally active centromeres on the same chromosome, regular breakages
occur between the centromeres. Such specific chromosome breakages can
give rise to the appearance of a chromosome fragment carrying a
neo-centromere. The neo-centromere may be found on a minichromosome
(neo-minichromosome), while a formerly dicentric chromosome may carry traces
of the heterologous nucleic acid.

a. The neo-minichromosome 

Breakage of a dicentric chromosome between the two functional
centromeres can form at least two chromosomes, for example, a so-called
minichromosome, and a formerly dicentric chromosome. Treatment of cells
containing a dicentric chromosome, such as, for example, recloning,
treatment with agents that destabilize the chromosomes, e.g., BrdU, and/or
culturing under selective conditions, may facilitate breakage of the dicentric
chromosome. Selection of transformed cells can yield cell lines containing a
stable neo-minichromosome. The breakage of a multicentric, typically
dicentric, chromosome in transformed cells, which separates the
neo-centromere from the remainder of the endogenous chromosome, may occur,
for example, in the G-band positive heterologous nucleic acid region, as is
suggested if traces of the heterologous nucleic acid sequences at the broken
end of the formerly dicentric chromosome are observed.

Multiple E-type amplification (amplification of euchromatin) may form a
neo-chromosome, which separates from the remainder of the dicentric
chromosome through a specific breakage between the centromeres of the
dicentric chromosome. Inverted duplication of the fragment bearing the
neo-centromere can result in the formation of a stable neo-minichromosome. The
minichromosome is generally at least about 20-30 Mb in size.

The presence of inverted chromosome segments can be associated
with the chromosomes formed de novo at the centromeric region of a
chromosome. During the formation of the neo-minichromosome, the event
leading to the stabilization of the distal segment of the chromosome that
bears the duplicated neo-centromere may be the formation of its inverted
duplicate.

Although the neo-minichromosome typically carries only one functional
centromere, both ends of the minichromosome can be heterochromatic,
carrying, for example, satellite DNA sequences as discernible by in situ
hybridization. Comparison of the G-band pattern of a chromosome fragment
carrying the neo-centromere with that of a stable neo-minichromosome can
indicate that the neo-minichromosome is an inverted duplicate of the
chromosome fragment that bears the neo-centromere.

Cells containing a de novo-formed minichromosome, which contains
multiple repeats of the heterologous nucleic acids, can be used as recipient
cells in cell transfection. Donor nucleic acids, such as heterologous nucleic
acids containing DNA encoding a desired protein and DNA encoding a
second selectable marker, can be introduced into the cells and integrated into
the de novo-formed minichromosomes. To facilitate integration into the de
novo-formed minichromosomes, the heterologous DNA may also contain
sequences that are homologous to nucleic acids already present in the
minichromosomes, which can, through homologous recombination, provide
targeted integration into the minichromosome. Nucleic acids can also be
integrated into the minichromosome through the use of site-specific
recombinases by producing minichromosomes containing site-specific
recombination sites as described herein. Integration can be verified by in situ
hybridization and Southern blot analyses. Transcription and translation of
heterologous DNA can be confirmed by primer extension, immunoblot
analyses and reporter gene assays, if a reporter gene has been included in
the heterologous DNA, using, for example, appropriate nucleic acid probes
and/or product-specific antibodies.

The resulting engineered minichromosome that contains the heterologous
DNA can also be transferred, for example by cell fusion, into a recipient
cell line to further verify correct expression of the heterologous DNA.
Following production of the cells, metaphase chromosomes can be obtained,
such as by addition of colchicine, and the minichromosomes purified using
methods as described herein. The resulting minichromosomes can be used
for delivery to specific cells of interest using any known method or methods
for transferring heterologous nucleic acids into cells, particularly plant cells,
and/or methods described herein.

Thus, the neo-minichromosome is stably maintained in cells, replicates
autonomously, and permits the persistent, long-term expression of genes
under non-selective culture conditions, and in a whole, intact, regenerated
plant. It also can contain megabases of heterologous known DNA that can
serve as target sites for homologous recombination and integration of DNA
of interest. The neo-minichromosome is, thus, a vector for the delivery and
expression of nucleic acids to cells.

Cell lines that contain artificial chromosomes, such as the 
minichromosome, the neo-chromosome, and the heterochromatic artificial 
chromosomes, are a convenient source of these chromosomes and can be
manipulated, such as by cell fusion or production of microcells for fusion
with selected cell lines, to deliver the chromosome of interest into a 
multiplicity of cell lines, including cells from a variety of different plant 
species. 

b. Heterochromatin-containing and predominantly
heterochromatic artificial chromosomes

Manipulation of cells containing a fragment released upon breakage of
the dicentric chromosome (e.g., a formerly dicentric chromosome), for
example, by introducing additional heterologous nucleic acids, including, for
example, DNA encoding a second selectable marker, and growth under
selective conditions, can yield heterochromatic structures. Included among
such structures are compositions referred to as sausage chromosomes and
megachromosomes. For example, a formerly dicentric chromosome may
translocate to the end of another chromosome, such as an acrocentric
chromosome. Additional heterologous nucleic acids added to cells containing
a formerly dicentric chromosome can integrate into the pericentric
heterochromatin of the formerly dicentric chromosome and be amplified
several times with megabases of pericentric heterochromatic satellite DNA
sequences, forming a "sausage" chromosome carrying a newly formed
heterochromatic chromosome arm. The size of this heterochromatic arm can
vary, for example, between ~150 and ~800 Mb in individual metaphases.
The chromosome arm can contain four to five satellite segments rich in
satellite DNA, and evenly spaced integrated heterologous "foreign" DNA
sequences. At the end of the compact heterochromatic arm of the sausage
chromosome, a less condensed euchromatic terminal segment may be
observed. By capturing a euchromatic terminal segment, this new
chromosome arm is stabilized in the form of the "sausage" chromosome. In
subclones of sausage chromosome-containing cell lines, the heterochromatic
arm of the sausage chromosome may become unstable and show continuous
intrachromosomal growth, particularly after treatment with BrdU and/or drug
selection to induce further H-type amplification. In extreme cases, the
amplified chromosome arm can exceed 500 Mb or even 1000 Mb in size
(gigachromosome). Thus, the gigachromosome is a structure in which a
heterochromatic arm has amplified but not broken off from a euchromatic
arm.

In situ hybridization with, for example, biotin-labeled subfragments of
the added heterologous nucleic acids may show a hybridization signal only in
the heterochromatic arm of the sausage chromosome, indicating that the
heterologous nucleic acid sequences are localized in the pericentric
heterochromatin.




Gene expression, however, may be possible in the heterochromatic
environment of a sausage chromosome. The level of heterologous gene
expression may be determined by Northern hybridization with a subfragment
of the selectable marker gene. Reporter genes included in heterologous
nucleic acids also provide a readily detectable product for use in evaluating
gene expression in a sausage or other heterochromatic or predominantly
heterochromatic chromosome. Southern hybridization of DNA isolated
from subclones of sausage chromosome-containing cells with subfragments
of reporter (and selectable marker) genes can show a close correlation
between the intensity of hybridization and the length of the sausage
chromosome.

Cell lines containing sausage chromosomes can be manipulated to
yield additional heterochromatic structures and artificial chromosomes,
including, for example, an artificial chromosome referred to as a
megachromosome. Such manipulation includes fusion of the cell line with
other cells and growth in the presence of one or more selection agents
and/or BrdU.

Cells with a structure, such as the sausage chromosome, can be
selected and fused with a second cell line, including cells of other plant and
non-plant species [see, e.g., Dudits et al. (1976) Hereditas 52:121-123 for the
fusion of human cells with carrot protoplasts and Wiegand et al. (1987) J.
Cell. Sci. (Pt. 2):145-149 for laser-induced fusion of plant protoplasts with
mammalian cells] to eliminate other chromosomes that are not of interest.
Structures such as sausage chromosomes formed during this process may be
further manipulated, for example, by treating the cells with agents that
destabilize chromosomes, e.g., BrdU, so that the heterochromatic arm forms
a chromosome that is substantially heterochromatic (e.g., a
megachromosome). Structures such as the gigachromosome, in which the
heterochromatic arm has amplified but not broken off from the euchromatic
arm, may also be observed. Further manipulation, such as fusions and
growth in selective conditions and/or BrdU treatment or other such
treatment, can lead to fragmentation of the megachromosome to form
smaller chromosomes that have the amplicon as the basic repeating unit.

If a cell with a sausage chromosome is selected, it can be treated with
an agent, such as BrdU, that destabilizes the chromosome so that the
heterochromatic arm forms a chromosome that is substantially
heterochromatic (e.g., a megachromosome). Prior to treating the cell with
BrdU, it can be fused with another cell line carrying chromosomes of another
species, in order to eliminate chromosomes of the original host cell and
obtain a cell in which the only chromosome from the host cell is the sausage
chromosome. The resulting hybrid cells can be grown in the presence of
multiple selection agents to select for those that carry the sausage
chromosome. In situ hybridization with chromosome painting probes that
detect chromosomes of both the host cell species and the species of cell to
which the host cell was fused can provide an indication of the chromosomal
make up of the hybrid cells.

Cell lines containing a sausage chromosome can be treated with a
destabilizing agent, such as BrdU, followed by growth in selective medium
and retreatment with BrdU. The BrdU treatments appear to destabilize the
genome, resulting in a change in the sausage chromosome as well. A cell
population in which a further amplification has occurred will arise. In
addition to the heterochromatic arm (which may, for example, be ~100-150
Mb) of the sausage chromosome, an extra centromere and another (for
example, ~150-250 Mb) heterochromatic chromosome arm may be formed.
By the acquisition of another euchromatic terminal segment, a new
submetacentric chromosome (e.g., a megachromosome) can form.

Megachromosomes may also be produced through regrowth and
establishment of sausage chromosome-containing cells in selective medium.
Repeated BrdU treatment can produce cell lines that have a dwarf
megachromosome (for example, about 150-200 Mb), a truncated
megachromosome (for example, about 90-120 Mb), or a
micro-megachromosome (for example, about 50-90 Mb). Cell lines containing
smaller truncated megachromosomes can be used to generate even smaller
megachromosomes, e.g., ~10-30 Mb in size. This may be accomplished,
for example, by breakage and fragmentation of a micro-megachromosome
through exposing the cells to X-ray irradiation, BrdU or telomere-directed in
vivo chromosome fragmentation.

Apart from the euchromatic terminal segments and the integrated
foreign nucleic acid, the whole megachromosome, as well as other related
types of predominantly heterochromatic artificial chromosomes, is
constitutive heterochromatin. This can be demonstrated by C-banding of the
megachromosome, which results in positive staining characteristic of
constitutive heterochromatin. It can contain tandem arrays of satellite DNA.
In a particular example, satellite DNA blocks are organized into a giant
palindrome (amplicon) carrying integrated exogenous nucleic acid sequences
at each end. It is of course understood that the specific organization and
size of each component can vary among species, and also with the
chromosome in which the amplification event initiates.

In general, a clear segmentation may be observed in one or more arms
of an amplification-based chromosome. For example, a megachromosome
may contain building units that are amplicons of, for example, ~30 Mb
containing satellite DNA with the integrated "foreign" DNA sequences at
both ends. The ~30 Mb amplicons may be composed of two ~15 Mb
inverted doublets of ~7.5 Mb satellite DNA blocks, which are separated
from each other by a narrow band of non-satellite sequences. The wider
non-satellite regions at the amplicon borders may contain integrated,
exogenous (heterologous) nucleic acid, while any narrow bands of
non-satellite DNA sequences within the amplicons may be integral parts of the
pericentric heterochromatin of the host chromosomes. The sizes of the
building units of a megachromosome or other amplification-based
chromosome may vary depending on the species of the host chromosome
from which the artificial chromosome was generated.
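
The nested size relationship of the exemplary amplicon described above (one ~30 Mb amplicon made of two ~15 Mb inverted doublets, each made of two ~7.5 Mb satellite DNA blocks) can be checked with a short Python sketch. The function and constant names are hypothetical, and the sketch deliberately ignores the narrow non-satellite bands, the integrated heterologous DNA and the euchromatic terminal segments, so the amplicon count it returns is only a rough upper estimate.

```python
AMPLICON_MB = 30.0         # approximate size of one amplicon (building unit)
DOUBLETS_PER_AMPLICON = 2  # two inverted doublets per amplicon
BLOCKS_PER_DOUBLET = 2     # two satellite DNA blocks per doublet


def amplicon_layout(amplicon_mb=AMPLICON_MB):
    """Return the approximate sizes (in Mb) of the amplicon, one doublet
    and one satellite DNA block for a given amplicon size."""
    doublet_mb = amplicon_mb / DOUBLETS_PER_AMPLICON
    block_mb = doublet_mb / BLOCKS_PER_DOUBLET
    return {
        "amplicon_mb": amplicon_mb,
        "doublet_mb": doublet_mb,
        "block_mb": block_mb,
    }


def approx_amplicon_count(arm_mb, amplicon_mb=AMPLICON_MB):
    """Rough number of whole amplicons that fit in an amplified
    chromosome arm of `arm_mb` megabases."""
    return int(arm_mb // amplicon_mb)
```

Because the building-unit size may vary with the host species, `amplicon_mb` is left as a parameter rather than fixed at 30 Mb.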

Further BrdU treatment can produce cells and/or calli that include cells
with a truncated megachromosome. The megachromosome can be further
fragmented in vivo using a chromosome fragmentation vector to ultimately
produce a chromosome that comprises a smaller stable replicable unit, for
example, about 15 Mb-60 Mb, containing one to four megareplicons.

Apart from the euchromatic terminal segments, the whole
megachromosome is heterochromatic, and has structural homogeneity.
Therefore, artificial chromosomes such as the megachromosome offer a
unique possibility for obtaining information about the amplification process,
and for analyzing some basic characteristics of the pericentric constitutive
heterochromatin, as a vector for heterologous DNA, and as a target for
further fragmentation.

C. Isolation of Artificial Chromosomes

The artificial chromosomes provided herein can be isolated by any
suitable method known to those of skill in the art. Also, methods are 
provided herein for effecting substantial purification, particularly of the 
artificial chromosomes. 

Artificial chromosomes may be sorted from endogenous
chromosomes using any suitable procedures, which typically involve isolating
metaphase chromosomes, distinguishing the artificial chromosomes from the
endogenous chromosomes, and separating the artificial chromosomes from
endogenous chromosomes. Such procedures will generally include the
following basic steps for animal cells and protoplasts: (1) culture of a
sufficient number of cells (typically about 2 x 10^7 mitotic cells) to yield,
preferably, on the order of 1 x 10^6 artificial chromosomes, (2) arrest of the
cell cycle of the cells in a stage of mitosis, preferably metaphase, using a
mitotic arrest agent such as colchicine, (3) treatment of the cells, particularly
by cell wall dissolution for plant cells and/or swelling of the cells in hypotonic
buffer, to increase susceptibility of the cells to disruption, (4) application
of physical force to disrupt the cells in the presence of isolation buffers for
stabilization of the released chromosomes, (5) dispersal of chromosomes in
the presence of isolation buffers for stabilization of free chromosomes, (6)
separation of artificial chromosomes from endogenous chromosomes and
(7) storage (and shipping if desired) of the isolated artificial chromosomes in
appropriate buffers. Modifications and variations of the general procedure
for isolation of artificial chromosomes, for example to accommodate different
cell types with differing growth characteristics and requirements and to
optimize the duration of mitotic block with arresting agents to obtain the
desired balance of chromosome yield and level of debris, may be empirically
determined (see Examples).
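
The typical figures given in step (1) above (about 2 x 10^7 mitotic cells to yield on the order of 1 x 10^6 artificial chromosomes) imply an overall recovery that can be estimated with a minimal Python sketch. The sketch assumes, purely for illustration, one artificial chromosome per mitotic cell; that copy number, the function names and the scaling example are hypothetical and are not stated in the procedure.

```python
import math


def recovery_fraction(chromosomes_recovered, mitotic_cells, copies_per_cell=1):
    """Fraction of the artificial chromosomes present in the starting
    mitotic cells that is recovered after isolation.

    `copies_per_cell` is an assumed number of artificial chromosomes
    per mitotic cell (1 by default, an illustrative assumption)."""
    return chromosomes_recovered / (mitotic_cells * copies_per_cell)


def cells_needed(target_chromosomes, recovery, copies_per_cell=1):
    """Number of mitotic cells to culture in order to recover
    `target_chromosomes`, given an overall recovery fraction."""
    if not 0 < recovery <= 1:
        raise ValueError("recovery must be in the interval (0, 1]")
    return math.ceil(target_chromosomes / (recovery * copies_per_cell))


# With the typical figures from step (1) and one copy per cell,
# the implied overall recovery is about 5%.
typical_recovery = recovery_fraction(1e6, 2e7)
```

A calculation of this kind is only a planning aid; the actual balance of chromosome yield and debris is determined empirically, as noted above.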

Steps 1-5 relate to isolation of metaphase chromosomes. The
separation of artificial from endogenous chromosomes (step 6) may be
accomplished in a variety of ways. For example, the chromosomes may be
stained with DNA-specific dyes such as Hoechst 33258 and chromomycin
A3 and sorted into artificial chromosomes and endogenous chromosomes on
the basis of dye content by employing fluorescence-activated cell sorting
(FACS).

Artificial chromosomes have been isolated by fluorescence-activated
cell sorting (FACS). This method takes advantage of the nucleotide base
content of the artificial chromosomes. In the case of predominantly
heterochromatic artificial chromosomes, by virtue of their high
heterochromatic DNA content, they will differ from any other chromosomes
in a cell. In a particular embodiment, metaphase chromosomes are isolated
and stained with base-specific dyes, such as Hoechst 33258 and
chromomycin A3. Fluorescence-activated cell sorting will separate artificial
chromosomes from the endogenous chromosomes. A dual-laser cell sorter
(such as, for example, a FACS Vantage; Becton Dickinson Immunocytometry
Systems), in which two lasers are set to excite the dyes separately, allows
a bivariate analysis of the chromosomes by base-pair composition and size.
Cells containing such artificial chromosomes can be similarly sorted.

Preparative amounts of artificial chromosomes (for example, 5 x 10^4 to
5 x 10^7 chromosomes/ml) at a purity of 95% or higher can be obtained. The
resulting artificial chromosomes are used for delivery to cells by methods
such as, for example, microinjection, liposome-mediated transfer, and
electroporation.

Additional methods provided herein for isolation of artificial
chromosomes from endogenous chromosomes include procedures that are
particularly well suited for large-scale isolation of artificial chromosomes. In
these methods, the size and density differences between artificial
chromosomes and endogenous chromosomes are exploited to effect
separation of these two types of chromosomes. To facilitate larger scale
isolation of the artificial chromosomes, different separation techniques may
be employed, such as swinging bucket centrifugation (to effect separation
based on chromosome size and density) [see, e.g., Mendelsohn et al. (1968)
J. Mol. Biol. 32:101-108], zonal rotor centrifugation (to effect separation on
the basis of chromosome size and density) [see, e.g., Burki et al. (1973)
Prep. Biochem. 3:157-182; Stubblefield et al. (1978) Biochem. Biophys. Res.
Commun. 83:1404-1414], and velocity sedimentation (to effect separation on
the basis of chromosome size and shape) [see, e.g., Collard et al. (1984)
Cytometry 5:9-19].

Affinity-, particularly immunoaffinity-, based methods for separation of 
ACs from endogenous chromosomes are also provided herein. For example,
artificial chromosomes which are predominantly heterochromatin may be
separated from endogenous chromosomes through immunoaffinity 
procedures involving antibodies that specifically recognize heterochromatin, 
and/or the proteins associated therewith, when the endogenous 
chromosomes contain relatively little heterochromatin. 

Immunoaffinity purification may also be employed in larger scale
artificial chromosome isolation procedures. In this process, large
populations of artificial chromosome-containing cells (asynchronous or
mitotically enriched) are harvested en masse and the mitotic chromosomes
(which can be released from the cells using standard procedures such as by
incubation of the cells, such as freshly isolated protoplasts, in hypotonic
buffer and/or detergent treatment of the cells in conjunction with physical
disruption of the treated cells) are enriched by binding to antibodies that are
bound to solid state matrices (e.g., column resins or magnetic beads).
Antibodies suitable for use in this procedure bind to condensed centromeric
proteins or condensed and DNA-bound histone proteins. For example,
autoantibody LU851 (see Hadlaczky et al. (1989) Chromosoma 97:282-288),
which recognizes mammalian centromeres, may be used for large-scale
isolation of chromosomes prior to subsequent separation of artificial
chromosomes from endogenous chromosomes using methods such as FACS.
The bound chromosomes would be washed and eventually eluted for sorting.

Immunoaffinity purification may also be used directly to separate
artificial chromosomes from endogenous chromosomes. For example, in the
case of artificial chromosomes that are predominantly heterochromatic, the
artificial chromosomes may be generated in or transferred to (e.g., by
microinjection or microcell fusion as described herein) a cell line that has
chromosomes that contain relatively small amounts of heterochromatin, such
as hamster cells (e.g., V79 cells or CHO-K1 cells). The predominantly
heterochromatic artificial chromosomes are then separated from the
endogenous chromosomes by utilizing anti-heterochromatin binding protein
(Drosophila HP-1) antibody conjugated to a solid matrix. Such matrix
preferentially binds artificial chromosomes relative to hamster chromosomes.
Unbound hamster chromosomes are washed away from the matrix and the
artificial chromosomes are eluted by standard techniques. Similarly, artificial
chromosomes of one species, e.g., a plant-derived artificial chromosome,
may be separated from a background of endogenous chromosomes of
another species, e.g., animal, such as mammalian, chromosomes, based on
immunological differences of the two species, provided that antibodies that
specifically recognize one species and not the other are available or can be
generated.

D. Generation of Artificial Chromosomes Through Assembly of 
Component Elements 

Artificial chromosomes can be constructed in vitro by assembling the
structural and functional elements that contribute to a complete chromosome
capable of stable replication and segregation alongside endogenous
chromosomes in cells. The identification of the discrete elements that in
combination yield a functional chromosome has made possible the in vitro
assembly of artificial chromosomes. The process of in vitro assembly of
artificial chromosomes, which can be rigidly controlled, provides advantages
that may be desired in the generation of chromosomes that, for example, are 
required in large amounts or that are intended for specific use in transgenic 
organism systems. 

For example, in vitro assembly may be advantageous when efficiency
of time and scale are important considerations in the preparation of artificial
chromosomes. Because in vitro assembly methods do not involve extensive
cell culture procedures, they may be utilized when the time and labor
required to transform, feed, cultivate, and harvest cells used in de novo
cell-based production systems are unavailable.

Provided herein are in vitro assembly methods that include the joining
of essential components, such as a centromere, telomere and an origin of
replication, to yield an artificial chromosome, in particular, an artificial
chromosome that functions in plants and that may contain components
derived from plant chromosomes. Also provided are artificial chromosomes
produced by the methods. Particular embodiments of the methods and
chromosomes include a megareplicator. The megareplicator may contain
rDNA, for example, mammalian or plant rDNA. In vitro assembled artificial
chromosomes may contain any amount of heterochromatic and/or
euchromatic nucleic acid. For example, an in vitro assembled artificial
chromosome may be substantially all heterochromatin, while still containing
protein-encoding DNA, or may contain increasing amounts of euchromatic
DNA, such that, for example, it contains about 10%, 20%, 30%, 40%,
50%, 60%, 70%, 80%, 90% or greater than about 90% euchromatic DNA.

In vitro assembly may also be rigorously controlled with respect to the
exact manner in which the several elements of the desired artificial
chromosome are combined and in what sequence and proportions they are
assembled to yield a chromosome of precise specifications. This feature is
of particular significance in the generation of plant artificial chromosomes
containing one or more regions of segmentation as described herein with
reference to amplification-based artificial chromosomes. For example, certain
plant chromosome structures (such as acrocentric chromosomes and/or
chromosomes containing adjacent regions of heterochromatin and rDNA) that
may be desirable for use in the generation of particular types of plant
artificial chromosomes via amplification-based methods as described herein
may be limited in number or may not exist. These particular types of plant
artificial chromosomes, e.g., certain predominantly heterochromatic plant
artificial chromosomes, may also be generated via in vitro assembly of
artificial chromosomes as described herein.

For example, plant artificial chromosomes containing regions of
repeated nucleic acid units that are predominantly heterochromatic may be
assembled by joining essential chromosomal components and repeat regions,
or may be generated from an in vitro assembled artificial chromosome via
amplification of heterochromatic DNA contained within an in vitro assembled
artificial chromosome. For generation of such chromosomes via amplification
of heterochromatic DNA contained within an in vitro assembled artificial
chromosome, nucleic acids are introduced into a cell containing an in vitro
assembled artificial chromosome and a resulting cell is selected that contains
an artificial chromosome containing one or more regions of repeated nucleic
acid units that are predominantly heterochromatic. The in vitro assembled
artificial chromosome either contains a megareplicator to facilitate
amplification of chromosomal DNA in connection with integration of nucleic
acid into the chromosome or megareplicator-containing DNA is included in
the nucleic acid that is integrated into the in vitro assembled artificial
chromosome.

The following describes the processes involved in the assembly of
artificial chromosomes in vitro, utilizing a megachromosome as exemplary
starting material.

1. Identification and isolation of the components of the artificial
chromosome

The chromosomes provided herein are elegantly simple chromosomes
for use in the identification and isolation of components to be used in the in
vitro assembly of expression systems or artificial chromosomes. The ability
to purify artificial chromosomes to a very high level of purity, as described
herein, facilitates their use for these purposes. For example, the
megachromosome, particularly truncated forms thereof, can serve as starting
material. With respect to the construction of an artificial chromosome
containing at least some mammalian cell derived components, possible
starting materials can be obtained from, for example, cell lines such as 1B3
and mM2C1, which are derived from H1D3 (deposited at the European
Collection of Animal Cell Culture (ECACC) under Accession No. 96040929).
With respect to the construction of an artificial chromosome containing at
least some plant cell derived components, possible starting materials include
cells containing PACs, e.g., megachromosomes, generated as described
herein.

For example, the mM2C1 cell line contains a micro-megachromosome
(~50-60 kB), which advantageously contains only one centromere, two
regions of integrated heterologous DNA with adjacent rDNA sequences, with
the remainder of the chromosomal DNA being mouse major satellite DNA.
Other truncated megachromosomes can serve as a source of telomeres, or
telomeres can be provided. The centromere of the mM2C1 cell line contains
mouse minor satellite DNA, which provides a useful tag for isolation of the
centromeric DNA.

Additional features of particular ACs provided herein, such as the
micro-megachromosome of the mM2C1 cell line, that make them uniquely
suited to serve as starting materials in the isolation and identification of
chromosomal components include the fact that the centromeres of each
megachromosome within a single specific cell line are identical. The ability
to begin with a homogeneous centromere source (as opposed to a mixture of
different chromosomes having differing centromeric sequences) greatly
facilitates the cloning of the centromere DNA. By digesting purified
megachromosomes, particularly truncated megachromosomes, such as the
micro-megachromosome, with appropriate restriction endonucleases and
cloning the fragments into commercially available and well known YAC
vectors (see, e.g., Burke et al. (1987) Science 236:806-812), BAC vectors
(see, e.g., Shizuya et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:8794-
8797; bacterial artificial chromosomes which have a capacity of incorporating
0.9-1 Mb of DNA) or PAC vectors (the P1 artificial chromosome vector
which is a P1 plasmid derivative that has a capacity of incorporating 300 kb
of DNA and that is delivered to E. coli host cells by electroporation rather
than by bacteriophage packaging; see, e.g., Ioannou et al. (1994) Nature
Genetics 6:84-89; Pierce et al. (1992) Meth. Enzymol. 216:549-574; Pierce
et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:2056-2060; U.S. Patent No.
5,300,431 and International PCT application No. WO 92/14819) vectors, it
plant satellite DNA, the heterologous DNA and/or rDNA, may be used to
identify and eliminate the non-centromeric DNA-containing clones.

Additionally, centromere cloning methods described herein may be
utilized to isolate the centromere-containing sequence of the
megachromosome.

Once the centromere fragment has been isolated, it may be sequenced
and the sequence information may in turn be used in PCR amplification of
centromere sequences from megachromosomes or other sources of
centromeres. Isolated centromeres may also be tested for function in vivo by
transferring the DNA into a host cell. Functional analysis may include, for
example, examining the ability of the centromere sequence to bind
centromere-binding proteins. The cloned centromere will be transferred to
cells with a selectable marker gene and the binding of a centromere-specific
protein, such as anti-centromere antibodies (e.g., LU851, see, Hadlaczky et
al. (1986) Exp. Cell Res. 167:1-15), can be used to assess function of the
centromeres.

b. Telomeres 

Telomeres that may be used in assembly of an artificial chromosome
include a 1 kB synthetic telomere (see, e.g., PCT Application Publication No.
WO 97/40183). A double synthetic telomere construct, which contains a 1
kB synthetic telomere linked to a dominant selectable marker gene that
continues in an inverted orientation may be used for ease of manipulation.
Such a double construct contains a series of TTAGGG repeats 3' of the
marker gene and a series of repeats of the inverted sequence, i.e., GGGATT,
5' of the marker gene as follows:

(GGGATT)n - dominant marker gene - (TTAGGG)n. Using an inverted
marker provides an easy means for insertion, such as by blunt end ligation,
since only properly oriented fragments will be selected.
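
The layout of the double construct described above can be illustrated with a short sketch. The Python below is an illustration only and is not part of the disclosed method: the function name, the placeholder marker string and the repeat count are hypothetical, and "inverted" is taken literally as the reversed repeat (TTAGGG read backwards gives GGGATT), as the text writes it.

```python
def double_telomere_construct(marker: str, n: int, repeat: str = "TTAGGG") -> str:
    """Return (inverted repeat)n + marker + (repeat)n as one 5'->3' string.

    `marker` stands in for the dominant selectable marker gene; any
    sequence string can be passed. All names here are hypothetical.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    inverted = repeat[::-1]  # TTAGGG -> GGGATT, the inverted sequence in the text
    return inverted * n + marker + repeat * n


# Toy example with a six-base placeholder instead of a real marker gene.
construct = double_telomere_construct("ATGAAA", n=2)
print(construct)
```

On this reading, a telomere of about 1 kB corresponds to roughly 167 copies of the six-base repeat on each side of the marker.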

Telomere sequences also include sequences described in plants, for
example, an Arabidopsis sequence containing head-to-tail arrays of the
monomer repeat CCCTAAA totaling a few, for example 3-4, kb in length.
Telomere sequences vary in length and do not appear to have a strict length
requirement. An example of a cloned telomere is found in GenBank
accession no. M20158 (Richards and Ausubel (1988) Cell 53:127-136) and
in U.S. Patent No. 5,270,201. Yeast telomere sequences include those
provided in GenBank accession no. S70807 (Louis et al. (1994) Yeast
10:271-274). Additionally, a method for isolating a higher eukaryotic
telomere from A. thaliana has been reported (Richards and Ausubel (1988)
Cell 53:127-136; and U.S. Patent No. 5,270,201).

c. Megareplicator

The megareplicator sequences, such as those containing rDNA,
provided herein are preferred for use in artificial chromosomes generated by
assembly of component elements in vitro. The rDNA provides an origin of
replication and also provides sequences that facilitate amplification of the
artificial chromosome in vivo to increase the size of the chromosome to, for
example, accommodate increasing copies of a heterologous gene of interest
as well as continuous high levels of expression of the heterologous genes.

d. Filler heterochromatin
Filler heterochromatin, particularly satellite DNA, is included to
maintain structural integrity and stability of the artificial chromosome and
provide a structural base for carrying genes within the chromosome. The
satellite DNA is typically A/T-rich DNA sequence, such as mouse major
satellite DNA, or G/C-rich DNA sequence, such as hamster natural satellite
DNA. Sources of such DNA include any eukaryotic organisms that carry
non-coding satellite DNA with sufficient A/T or G/C composition to promote
ready separation by sequence, such as by FACS, or by density gradients.
Examples of plant satellite DNA include, but are not limited to, satellite DNA
of soybean (see, e.g., Morgante et al. (1997) Chromosome Res. 5:363-373;
and Vahedian et al. (1995) Plant Mol. Biol. 29:857-862), satellite DNA on
the rye B chromosome (see, e.g., Langdon et al. (2000) Genetics 154:869-
884) and satellite DNA in the Saccharum complex (see, e.g., Alix et al.
(1998) Genome 41:854-864). The satellite DNA may also be synthesized by
generating sequence containing monotone, tandem repeats of highly A/T- or
G/C-rich DNA units.

The most suitable amount of filler heterochromatin for use in
construction of the artificial chromosome may be empirically determined by,
for example, including segments of various lengths, increasing in size, in the
construction process. Fragments that are too small to be suitable for use will
not provide for a functional chromosome, which may be evaluated in cell-
based expression studies, or will result in a chromosome of limited functional
lifetime or mitotic and structural stability.

e. Selectable marker
Any convenient selectable marker, including specific examples
described herein, may be used and at any convenient locus in the expression
system.

2. Combination of the isolated chromosomal elements

Once the isolated elements are obtained, they may be combined to
generate the complete, functional artificial chromosome expression system.
This assembly can be accomplished, for example, by in vitro ligation in
solution, in LMP agarose or on microbeads. The ligation is conducted so that
one end of the centromere is directly joined to a telomere. The other end of
the centromere, which serves as the gene-carrying chromosome arm, is built
up from a combination of satellite DNA and megareplicator sequences, e.g.,
rDNA sequence, and may also contain a selectable marker gene. Another
telomere is joined to the end of the gene-carrying chromosome arm. The
gene-carrying arm is the site at which any heterologous genes of interest, for
example, in expression of desired proteins encoded thereby, are incorporated
either during in vitro assembly of the chromosome or sometime thereafter.

3. Analysis and testing of the artificial chromosome expression
systems

Artificial chromosomes assembled in vitro may be tested for
functionality in cell systems, such as plant and animal cells, using any of the
methods described herein for the artificial chromosomes, minichromosomes,
or known to those of skill in the art.

4. Introduction of desired heterologous DNA into the in vitro
assembled chromosome

Heterologous DNA may be introduced into the in vitro synthesized
chromosome using routine methods of molecular biology, may be introduced
using the methods described herein for the artificial chromosomes, or may be
incorporated into the in vitro assembled chromosome as part of one of the
synthetic elements, such as the heterochromatin. The heterologous DNA
may be linked to a selected repeated fragment, and then the resulting
construct may be amplified in vitro using the methods for such in vitro
amplification provided herein.

In a particular embodiment of these in vitro assembly methods, a
site-specific recombination site is included in the assembly DNA or is added
into the assembled chromosome, such as a plant in vitro assembled artificial
chromosome, after initial assembly. The presence of a recombination site in
the in vitro assembled artificial chromosome facilitates recombinase-catalyzed
introduction of heterologous nucleic acid into the chromosome if the
heterologous nucleic acid also contains a complementary recombination site.
Such recombination systems include, but are not limited to, Cre/lox [see,
e.g., Dale and Ow (1995) Gene 91:79-85], FLP/FRT [see, e.g., Nigel et al.
(1995) The Plant Journal 8:637-652], R/RS [see, e.g., Onouchi et al. (1991)
Nuc. Acids Res. 19:6373-6378], Gin/gix [see, e.g., Maeser and Kahman
(1991) Mol. Gen. Genet. 230:170-176] and Int/att. The introduction of att
recombination sites into a chromosome and the use of lambda phage
integrase recombinase in conjunction therewith to permit engineering of
natural and artificial chromosomes is described in copending U.S. provisional
application Serial No. 60/294,758, by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2001, U.S.
provisional application Serial No. 60/366,891, by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on March 21, 2002, U.S. patent
application Serial No. , by Perkins et al. entitled "CHROMOSOME-
BASED PLATFORMS" filed on May 30, 2002, under attorney docket no.
24601-420, and PCT International Application No. , by Perkins et al.
entitled "CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002,
under attorney docket no. 24601-420PC, each of which is incorporated
herein in its entirety by reference thereto. Thus, also contemplated herein
are in vitro assembled artificial chromosomes, in particular such
chromosomes containing plant chromosome-derived components, that
contain one or more recombination sites, such as an att site.

E. Methods for the Production of Plant Acrocentric Chromosomes and
Plant Chromosomes Containing Adjacent Regions of rDNA and
Heterochromatin

Acrocentric human and mouse chromosomes in which the short arm
contains only pericentric heterochromatin, an rDNA array, and telomeres can
be used in the de novo formation of a satellite DNA based artificial
chromosome (SATAC, also referred to as ACes). In some embodiments of
the methods of producing a plant artificial chromosome provided herein, it
may be desirable to introduce heterologous nucleic acids into a plant
chromosome with arms of unequal length (e.g., into the short arm of an
acrocentric chromosome) and/or containing adjacent regions of rDNA and
heterochromatin, such as pericentric heterochromatin or satellite DNA. Of
particular interest in such methods are plant acrocentric chromosomes that
contain rDNA located adjacent to the pericentric heterochromatin or satellite
DNA, and, in particular, on the short arm of the chromosome with little to no
euchromatic DNA between the rDNA and the pericentric heterochromatin.
Utilizing such structures as the initial composition in the generation of plant
artificial chromosomes may facilitate generation of plant artificial
chromosomes that are predominantly heterochromatic. For example,
introduction of heterologous nucleic acid into a cell containing such an
acrocentric plant chromosome such that the nucleic acid integrates into the
pericentric heterochromatin and/or rDNA of the short arm of the chromosome
may be associated with amplification (possibly through "megareplicator"
DNA sequences such as may reside in plant rDNA arrays, also known as the
nucleolar organizing regions (NOR)) of heterochromatin that leads to the
formation of a predominantly heterochromatic plant artificial chromosome.

Naturally occurring acrocentric plant chromosomes are limited in
number, and plant chromosomes with a structure that includes adjacent
regions of heterochromatin and rDNA may not exist or may not exist for a
variety of plant species. Provided herein are methods for generating
acrocentric plant chromosomes and plant chromosomes containing adjacent
regions of rDNA and heterochromatin, in particular, pericentric and/or
satellite heterochromatin. Further provided herein are methods for generating
acrocentric plant chromosomes containing adjacent regions of
heterochromatin, such as pericentric heterochromatin and/or satellite DNA,
and rDNA on the short arm of the chromosome.

Also provided herein are plant acrocentric chromosomes in which the
nucleic acid of one or both arms of the chromosome contains less than about
50%, or less than about 40%, or less than about 30%, or less than about
20%, or less than about 10%, or less than about 5%, or less than about
2%, or less than about 1%, or less than about 0.5% or less than about
0.1% euchromatin. In some embodiments of these chromosomes, the
nucleic acid of only one arm, either the short arm or the long arm, contains
less than these specified amounts of euchromatin. In a particular
embodiment of these chromosomes, the nucleic acid of the short arm
contains less than these specified amounts of euchromatin.
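
As a purely illustrative aid, the percentage limits listed above can be checked against an annotated chromosome arm as follows. The sketch assumes a hypothetical representation in which an arm is a mapping from chromatin class to length, and it treats each "less than about" limit as a strict numerical cutoff, which the text does not require.

```python
from typing import Dict, List

# Limits quoted in the text, in percent euchromatin.
LIMITS_PERCENT = [50, 40, 30, 20, 10, 5, 2, 1, 0.5, 0.1]


def euchromatin_percent(arm: Dict[str, float]) -> float:
    """Percent of an arm's total length annotated as euchromatin."""
    total = sum(arm.values())
    if total <= 0:
        raise ValueError("arm must have positive total length")
    return 100.0 * arm.get("euchromatin", 0.0) / total


def limits_satisfied(percent: float) -> List[float]:
    """Limits that the arm falls strictly below."""
    return [limit for limit in LIMITS_PERCENT if percent < limit]


# Hypothetical short arm, lengths in kb.
short_arm = {"heterochromatin": 9000, "rDNA": 800, "euchromatin": 200}
pct = euchromatin_percent(short_arm)
print(pct, limits_satisfied(pct))
```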

Further provided herein are plant chromosomes containing adjacent
regions of heterochromatin, in particular pericentric heterochromatin or
satellite DNA, and rDNA with little to no euchromatin between the two
regions. With reference to such plant chromosomes, "little to no" means that
the amount of euchromatic DNA, if any, located between the rDNA and
heterochromatin (such as pericentric heterochromatin and/or satellite DNA),
generally does not stain diffusely and recognizably as euchromatin and/or
does not contain protein-encoding genes. Thus, in these chromosomes,
between the heterochromatin (such as pericentric heterochromatin and/or
satellite DNA) and the rDNA, there is substantially no chromatin that is less
condensed than the heterochromatin (e.g., pericentric heterochromatin). The
plant chromosomes containing adjacent regions of rDNA and
heterochromatin (such as pericentric heterochromatin) provided herein may
be acrocentric chromosomes. In a particular embodiment of these plant
chromosomes, the adjacent regions of rDNA and heterochromatin, in
particular pericentric heterochromatin, are contained on the short arm of the
chromosome.

Further provided are methods of utilizing such plant chromosomes in
the generation of plant artificial chromosomes, and, in particular,
predominantly heterochromatic plant artificial chromosomes, such as ACes
(also referred to as SATACs). In particular methods of producing plant
artificial chromosomes provided herein, nucleic acids are introduced into a
cell containing a plant chromosome that is acrocentric and/or contains
adjacent regions of rDNA and heterochromatin, such as pericentric
heterochromatin, the cells are cultured through at least one cell division and
a cell comprising an artificial chromosome, such as a predominantly
heterochromatic artificial chromosome, is selected. In these methods, the
plant chromosome into which nucleic acid is introduced may be an
acrocentric chromosome containing adjacent regions of rDNA and
heterochromatin on the short or long arm, and, in particular, on the short
arm.

The plant chromosomes provided herein can be generated using
site-specific recombination between plant chromosome regions. The regions
may be on the same chromosome or separate chromosomes. Through
site-specific recombination, sections of plant chromosomes may be altered to
remove, invert and/or insert sequences such that a desired plant
chromosome results. The resulting plant chromosome is acrocentric and/or
contains adjacent regions of heterochromatic DNA and rDNA, which may or
may not be on the short arm of an acrocentric chromosome. Thus, the
starting chromosome in these methods may be a plant chromosome or may
be a plant acrocentric chromosome that does not contain adjacent regions of
rDNA and heterochromatin, such as pericentric heterochromatin or satellite
DNA. If the starting chromosome is acrocentric, then it may be used in the
generation of a plant acrocentric chromosome that contains adjacent regions
of heterochromatic DNA (e.g., pericentric heterochromatin and/or satellite
DNA) and rDNA, particularly on the short arm of the chromosome, or to
generate a plant acrocentric chromosome in which the nucleic acid of one or
both arms contains less than about 50%, or less than about 40%, or less
than about 30%, or less than about 20%, or less than about 10%, or less
than about 5%, or less than about 2%, or less than about 1%, or less than
about 0.5% or less than about 0.1% euchromatin.

In one of the methods provided herein for producing a plant
chromosome that is acrocentric and/or contains adjacent regions of rDNA
and heterochromatin, nucleic acid containing a site-specific recombination
site and nucleic acid containing a complementary site-specific recombination
site are introduced into a cell containing one or more plant chromosomes.
The nucleic acids may be introduced into the cell sequentially or
simultaneously. The nucleic acids may also be targeted to particular
chromosomes and/or particular sequences of a chromosome. Such targeting
may be accomplished by including in the nucleic acids sequences
homologous to particular sequences in the chromosome(s).

The cell is then exposed to a recombinase activity. The recombinase
activity can be provided by introduction of nucleic acid encoding the activity
into the cell for expression of the activity therein, or may be added to the cell
from an exogenous source. The recombinase activity is one that catalyzes
recombination between sequences at the two recombination sites. An
appropriate recombination event produces a plant chromosome that is
acrocentric and/or contains adjacent regions of rDNA and heterochromatin
(such as pericentric heterochromatin and/or satellite DNA) which may be
readily identified therein based on its particular structure (e.g., arms of
unequal length if the chromosome is acrocentric) and/or other features, e.g.,
the presence of particular added sequences, such as recombination sites and
DNA encoding a selectable marker, the absence of particular sequences,
such as excised euchromatic DNA, and the arrangement of sequences, such
as the placement of rDNA segments adjacent to pericentric heterochromatin
and/or satellite DNA. Such attributes may be detected using techniques
known in the art for the analysis of nucleic acids and chromosomes, such as,
for example, in situ hybridization.

A number of site-specific recombination systems may be used in the
production of plant chromosomes that are acrocentric and/or contain rDNA
adjacent to heterochromatin, such as pericentric heterochromatin, as
described herein. Such systems include, but are not limited to, Cre/lox [see,
e.g., Dale and Ow (1995) Gene 91:79-85], FLP/FRT [see, e.g., Nigel et al.
(1995) The Plant Journal 8:637-652], R/RS [see, e.g., Onouchi et al. (1991)
Nuc. Acids Res. 19:6373-6378], Gin/gix [see, e.g., Maeser and Kahman
(1991) Mol. Gen. Genet. 230:170-176] and Int/att. The introduction of att
recombination sites into a chromosome and the use of lambda phage
integrase recombinase in conjunction therewith to permit engineering of
natural chromosomes is described in copending U.S. provisional application
Serial No. 60/294,758 by Perkins et al. entitled "CHROMOSOME-BASED
PLATFORMS" filed on May 30, 2001, U.S. provisional application Serial No.
60/366,891, by Perkins et al. entitled "CHROMOSOME-BASED
PLATFORMS" filed on March 21, 2002, U.S. patent application Serial No.
, by Perkins et al. entitled "CHROMOSOME-BASED PLATFORMS" filed
on May 30, 2002, under attorney docket no. 24601-420, and PCT
International Application No. , by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002, under
attorney docket no. 24601-420PC, each of which is incorporated herein in
its entirety by reference thereto. These systems, as well as others known in
the art, can be used to specifically excise or invert DNA (for example, in an
intrachromosomal recombination), exchange regions of DNA (for example, in
an inter-chromosomal recombination) or insert DNA (for example, through
recombination between homologous sequences at a recombination site and
the DNA to be inserted). The precise event is controlled by the orientation of
the recombination site DNA sequences.
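
The dependence of the outcome on site orientation can be illustrated with a toy model. The sketch below is an illustration under stated assumptions, not a description of any particular recombinase: a chromosome arm is a hypothetical list of labelled segments running from the centromere outward, each recombination site sits at a boundary between segments, and the function name and segment labels are invented for the example.

```python
from typing import List, Tuple


def recombine_same_chromosome(
    segments: List[str], i: int, j: int, direct: bool
) -> Tuple[List[str], List[str]]:
    """Recombine two sites at boundaries i < j of one chromosome.

    Boundary k lies just before segments[k]. Returns the rearranged
    chromosome and the excised segments (empty for an inversion).
    """
    if not 0 <= i < j <= len(segments):
        raise ValueError("need 0 <= i < j <= len(segments)")
    between = segments[i:j]
    if direct:
        # Sites in direct (head-to-tail) orientation: intervening DNA is excised.
        return segments[:i] + segments[j:], between
    # Sites in inverted (head-to-head) orientation: intervening DNA is flipped.
    return segments[:i] + between[::-1] + segments[j:], []


arm = ["CEN", "HET", "EU", "rDNA", "TEL"]

# Direct sites flanking the euchromatin: the euchromatin is excised.
excised_arm, removed = recombine_same_chromosome(arm, 2, 3, direct=True)
print(excised_arm, removed)

# Inverted sites flanking euchromatin plus rDNA: their order is reversed.
inverted_arm, _ = recombine_same_chromosome(arm, 2, 4, direct=False)
print(inverted_arm)
```

In both toy outcomes the rDNA segment ends up next to the heterochromatin; in the inversion case the euchromatin is retained but moved distal to the rDNA.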

In particular embodiments of the methods for producing an acrocentric
plant chromosome provided herein, nucleic acid containing complementary
recombinase recognition sites for site-specific recombination is introduced
into a cell containing one or more plant chromosomes wherein one of the
sites integrates into, or close to, the pericentric heterochromatin and/or
satellite DNA (in particular, proximal satellite DNA) of one plant chromosome
in the cell. In a further embodiment, nucleic acid containing complementary
recombinase recognition sites for site-specific recombination is introduced
into a cell containing one or more plant chromosomes wherein one of the
sites integrates into the distal end of an arm of a plant chromosome in the
cell. In these embodiments, recombination between the sites in the presence
of a recombinase that recognizes the sites can result in deletion of a portion
of an arm of a chromosome, reciprocal translocation between a distal portion
of a chromosome arm and a more proximal portion of another chromosome
arm or reciprocal translocation between pericentric heterochromatin and/or
satellite DNA of one chromosomal arm and a more distal portion of another
chromosome arm. Each of these recombination events can serve to reduce
the length of a chromosome arm and give rise to an acrocentric
chromosome.

In another embodiment, a nucleic acid containing a site-specific
recombination site is introduced into a cell containing plant chromosomes
wherein it integrates into the pericentric heterochromatin and/or satellite
DNA of one plant chromosome in the cell and nucleic acid containing a
complementary site-specific recombination site is introduced into the cell
wherein it integrates into the distal end of an arm of another plant
chromosome in the cell. In this embodiment, recombination between the
sites in the presence of a recombinase that recognizes the sites can result in
reciprocal translocation between the pericentric heterochromatin and/or
satellite DNA of one chromosome and the distal portion of another
chromosome arm thereby bringing these two regions into close proximity on
one chromosomal arm and reducing the amount of DNA between the
pericentric region of the arm and the end of the arm to generate an
acrocentric plant chromosome.
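
The arm-shortening effect of such a reciprocal translocation can be sketched numerically. The following Python is a hypothetical illustration: arms are lists of (label, length) pairs running from the centromere outward, the lengths are arbitrary units chosen for the example, and the long-arm to short-arm ratio of 3 used to call a chromosome acrocentric is an assumed cutoff for this sketch, not a value taken from the text.

```python
from typing import List, Tuple

Segment = Tuple[str, float]  # (label, length in arbitrary units)


def reciprocal_translocation(
    arm_a: List[Segment], i: int, arm_b: List[Segment], j: int
) -> Tuple[List[Segment], List[Segment]]:
    """Swap the portions of two arms distal to boundaries i and j."""
    return arm_a[:i] + arm_b[j:], arm_b[:j] + arm_a[i:]


def arm_length(arm: List[Segment]) -> float:
    return sum(length for _, length in arm)


def looks_acrocentric(
    short_arm: List[Segment], long_arm: List[Segment], min_ratio: float = 3.0
) -> bool:
    """Assumed criterion: long arm at least min_ratio times the short arm."""
    return arm_length(long_arm) >= min_ratio * arm_length(short_arm)


# Site 1 sits just distal to the pericentric heterochromatin of arm A;
# site 2 sits near the distal end of an arm of another chromosome (arm B).
arm_a = [("pericentric_HET", 5.0), ("EU_a", 80.0), ("TEL_a", 1.0)]
arm_b = [("EU_b", 90.0), ("distal_b", 4.0), ("TEL_b", 1.0)]
other_arm_of_a = [("long_arm", 100.0)]

new_a, new_b = reciprocal_translocation(arm_a, 1, arm_b, 1)
print(arm_length(arm_a), "->", arm_length(new_a))
print(looks_acrocentric(arm_a, other_arm_of_a), looks_acrocentric(new_a, other_arm_of_a))
```

In this toy case the recombined arm A keeps its pericentric heterochromatin but carries only the short distal portion of arm B, so the chromosome goes from two arms of similar length to a markedly shorter arm opposite the unchanged one.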

These methods for producing an acrocentric plant chromosome may
also be conducted such that nucleic acid containing a site-specific
recombination site is introduced into a cell containing a plant chromosome
wherein it integrates into, or close to, the pericentric heterochromatin and/or
satellite DNA of a plant chromosome in the cell and nucleic acid containing a
complementary site-specific recombination site is introduced into the cell
wherein it integrates into the distal end of the same arm of the same
chromosome. In this embodiment, recombination between the sites in direct
(i.e., the same, or head-to-tail) orientation in the presence of a recombinase
that recognizes the sites can result in intrachromosomal recombination
between the pericentric heterochromatin (and/or satellite DNA) and the distal
portion of the chromosomal arm thereby excising DNA between these two
regions and reducing the amount of DNA between them to generate an
acrocentric plant chromosome.

In particular embodiments of the methods provided herein for
producing a plant chromosome containing adjacent regions of rDNA and
heterochromatin, such as pericentric heterochromatin and/or satellite DNA,
nucleic acid containing complementary recombinase recognition sites for
site-specific recombination is introduced into a cell containing one or more
plant chromosomes wherein one of the sites integrates into heterochromatin
of one plant chromosome in the cell. In a further embodiment, nucleic acid
containing complementary recombinase recognition sites for site-specific
recombination is introduced into a cell containing one or more plant
chromosomes wherein one of the sites integrates into rDNA or a nucleolar
organizing region (NOR) of a plant chromosome in the cell. In these
embodiments, recombination between the sites in the presence of a
recombinase that recognizes the sites can result in deletion of DNA between
a heterochromatic region, such as the pericentric heterochromatin (and/or
satellite DNA), and rDNA, inversion of DNA that includes heterochromatin or
rDNA of a plant chromosome or reciprocal translocation between
heterochromatin of one chromosomal arm and rDNA of another chromosomal
arm. Each of these recombination events can serve to arrange chromosomal
DNA such that a region of heterochromatic DNA, such as pericentric
heterochromatin and/or satellite DNA, is adjacent to a region of rDNA on a
plant chromosome.

In another embodiment, nucleic acid containing a site-specific
recombination site is introduced into a cell containing plant chromosomes
wherein it integrates into heterochromatin, such as, for example, pericentric
heterochromatin and/or satellite DNA, of one plant chromosome in the cell
and nucleic acid containing a complementary site-specific
recombination site is introduced into the cell wherein it integrates into rDNA
of another plant chromosome in the cell. In this embodiment, recombination
between the sites can result in reciprocal translocation between the
heterochromatin of one chromosome and the rDNA of another chromosome
thereby bringing these two regions into close proximity on one plant
chromosome with little to no euchromatin between them.

These methods for producing a plant chromosome containing adjacent
regions of heterochromatic DNA and rDNA may also be conducted such that
nucleic acid containing a site-specific recombination site is introduced into a
cell containing a plant chromosome wherein it integrates into
heterochromatin, for example, pericentric heterochromatin and/or satellite
DNA, of a plant chromosome and nucleic acid containing a complementary
site-specific recombination site is introduced into the cell wherein it
integrates into rDNA of the same chromosome. In this embodiment,
recombination between the sites in direct orientation in the presence of a
recombinase that recognizes the sites can result in intrachromosomal
recombination between heterochromatin, such as pericentric heterochromatin
(and/or satellite DNA), and rDNA thereby excising DNA, including
euchromatic DNA, between these two regions. Recombination of the sites in
indirect (i.e., head-to-head) orientation in the presence of a recombinase can
result in inversion of DNA between the sites thereby replacing DNA, such as
euchromatin, located between pericentric heterochromatin (and/or satellite
DNA) and rDNA on the chromosome with rDNA. Thus, in the resulting plant
chromosome, rDNA is located adjacent to pericentric heterochromatin (and/or
satellite DNA), and DNA that was present between the pericentric
heterochromatin (and/or satellite DNA) and the rDNA is located distal to the
rDNA in a position previously occupied by the rDNA.

In particular embodiments for producing an acrocentric plant
chromosome containing adjacent regions of heterochromatin, such as
pericentric heterochromatin (and/or satellite DNA), and rDNA, the short arm
of the acrocentric chromosome may be generated in the same recombination
event that places the heterochromatin and rDNA regions adjacent to each
other or in a separate recombination event. For example, nucleic acid
containing a site-specific recombination site may be introduced into a cell
containing one or more plant chromosomes wherein it integrates into the
pericentric heterochromatin of one plant chromosome and nucleic acid
containing a complementary site-specific recombination site may be
introduced into the cell wherein it integrates into rDNA that is located at a
distal portion of another plant chromosome or of the same arm of the same
chromosome. Recombination of the sites in the presence of a
recombinase can result in intra- or inter-chromosomal recombination that not
only brings the pericentric heterochromatin (and/or satellite DNA) and rDNA
into close proximity on one chromosomal arm, but also sufficiently reduces
the length of that arm such that the resulting chromosome is acrocentric.

If a single recombination event such as this does not generate an
acrocentric plant chromosome, multiple recombination events may be used to
produce an acrocentric plant chromosome containing adjacent regions of
heterochromatic DNA and rDNA. For example, nucleic acid containing a
site-specific recombination site may be introduced into a cell containing one or
more plant chromosomes wherein it integrates into the pericentric
heterochromatin (and/or satellite DNA) of one plant chromosome and nucleic
acid containing a complementary site-specific recombination site may be
introduced into the cell wherein it integrates into rDNA of the same or a
different plant chromosome. As described above, recombination between
the sites in the presence of a recombinase can result in deletion, inversion or
reciprocal translocation of DNA to arrange chromosomal DNA such that
pericentric heterochromatin (and/or satellite DNA) is adjacent to a region of
rDNA on a plant chromosome. In order to reduce the length of the arm of
the chromosome on which the adjacent regions of heterochromatin and rDNA
are located, an additional recombination event can be induced by introducing
nucleic acid containing a site-specific recombination site into a cell containing
this plant chromosome wherein it integrates into a region of the chromosome
distal to the rDNA and nucleic acid containing a complementary site-specific
recombination site into the cell wherein it integrates into the distal end of the
same chromosome arm or of another plant chromosome arm. Recombination
between the recognition sites can result in deletion or reciprocal translocation
of DNA to reduce the length of the chromosome arm distal to the rDNA and
give rise to an acrocentric plant chromosome containing adjacent regions of
heterochromatin and rDNA on the short arm of the chromosome.

In each of the aforementioned methods for producing a plant
chromosome that is acrocentric and/or contains adjacent regions of
heterochromatin and rDNA, the nucleic acid containing the two or more
recombination sites may be introduced simultaneously or sequentially into a
cell or cells using nucleic acid transfer methods described herein or known in
the art. The nucleic acids may randomly integrate into plant chromosomes or
may be targeted for integration into a particular region or site on a plant
chromosome through homologous recombination between sequences in the
nucleic acid and sequences within the chromosome. The recombinase
activity may be provided by introduction of nucleic acid encoding an
appropriate recombinase into the cell for expression therein. The
recombinase-encoding nucleic acid may be introduced into the cell prior to,
during or after introduction of nucleic acids encoding recombination sites.

To facilitate identification of cells containing the transferred nucleic
acids and/or in which a recombination event has occurred, nucleic acid
encoding a selectable marker may be introduced into the cell. For example,
one or both of the nucleic acids containing a recombination site may also
contain DNA encoding a selectable marker (e.g., a resistance-encoding
marker or a reporter molecule) operatively linked to a promoter which is
oriented such that integration of the nucleic acid into a chromosome places
the marker DNA between two directly oriented recombination sites on an arm
of a chromosome. A cell containing the nucleic acid will thus be resistant to
a selection agent or will detectably express a reporter molecule. Exposure of
the cell to the appropriate recombinase can result in a recombination event
that excises the DNA between the two recombination sites, which includes
DNA encoding the selectable marker. Thus, recombination could be detected
as loss of reporter molecule expression or decreased resistance to a selection
agent. After exposure to a recombinase, the cells into which nucleic
acids containing recombination sites have been transferred may be analyzed
for the presence of acrocentric plant chromosomes using, for example, FISH
analysis and other chromosome visualization techniques.

In another method provided herein for producing a plant chromosome
that is acrocentric and/or contains adjacent regions of heterochromatin and
rDNA, the recombination event or events that lead to formation of the
chromosome occur through crossing of transgenic plants that contain
chromosomes which contain complementary site-specific recombination
sites. Thus, in one embodiment of these methods, nucleic acid containing a
recombination site adjacent to nucleic acid encoding a selectable marker is
introduced into a first plant cell and a first transgenic plant is generated from
the first plant cell. Nucleic acid containing a promoter functional in a plant
cell, a recombination site and a recombinase coding region in operative
linkage is introduced into a second plant cell from which a second transgenic
plant is generated. The first and second transgenic plants are crossed to
obtain one or more plants resistant to an agent that selects for cells
containing the nucleic acid encoding the selectable marker, and a resistant
plant that contains cells comprising a plant chromosome that is acrocentric
and/or contains adjacent regions of heterochromatin and rDNA is selected.

In an example of this method, nucleic acids containing site-specific
recombination sites are introduced into cells of Nicotiana tabacum. The
nucleic acids are introduced separately by infecting leaf explants with
Agrobacterium tumefaciens which carries the kanamycin-resistance gene
(Kan^R). Kanamycin-resistant transgenic plants are generated from the
infected leaf explants. One transgenic plant contains nucleic acid encoding a
promoterless hygromycin-resistance gene preceded by a lox site-specific
recombination sequence (lox-hpt); the other plant contains a cauliflower
mosaic virus 35S promoter linked to a lox sequence and the cre DNA
recombinase coding region (35S-lox-cre). The resultant Kan^R transgenic
plants are crossed (see, e.g., protocols of Qin et al. (1994) Proc. Natl. Acad.
Sci. U.S.A. 91:1706-1710). Plants in which the appropriate DNA
recombination event has occurred are identified by hygromycin-resistance.
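
The selection logic of this cross may be sketched as follows. The Python
fragment below is illustrative only. It assumes, for simplicity, that each
parent is hemizygous for a single unlinked copy of its construct, so that the
four progeny classes are equally frequent; the function names are
hypothetical. It encodes the point that hygromycin resistance requires both
constructs together with a recombination event that places the 35S promoter
upstream of the promoterless hpt coding region.

```python
from fractions import Fraction
from itertools import product


def f1_classes():
    """Progeny classes from a cross of a hemizygous lox-hpt parent with a
    hemizygous 35S-lox-cre parent (assumed single, unlinked insertions).

    Keys are (has_lox_hpt, has_35s_lox_cre); values are expected fractions.
    """
    return {genotype: Fraction(1, 4)
            for genotype in product([True, False], repeat=2)}


def hygromycin_resistant(has_lox_hpt, has_35s_lox_cre, recombined):
    """Resistance needs both constructs AND a Cre-mediated lox x lox event
    that joins the 35S promoter to the promoterless hpt gene."""
    return has_lox_hpt and has_35s_lox_cre and recombined


# Only the class carrying both constructs can yield resistant plants,
# and only when recombination has actually occurred.
candidates = [g for g in f1_classes()
              if hygromycin_resistant(*g, recombined=True)]
print(candidates)
```

Under these assumptions at most one quarter of the progeny can score as
hygromycin-resistant, which is why the subsequent FISH screening is applied
only to the resistant class.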

The Kan^R cultivars initially may be screened, such as by FISH, to
identify two sets of candidate transgenic plants. One set has one construct
integrated in regions adjacent to the pericentric heterochromatin (and/or
satellite DNA) on the short arm of any chromosome. The second set of
candidate plants has the other construct integrated in rDNA, such as the
NOR region, of appropriate chromosomes. To obtain reciprocal translocation,
both sites must be in the same orientation. Therefore, a series of crosses
may be required, marker-resistant plants generated, and FISH analyses
performed to identify an "acrocentric" plant chromosome or chromosomes
that contain adjacent regions of heterochromatin. As described above, such
an acrocentric chromosome may be used for de novo plant artificial
chromosome formation, particularly predominantly heterochromatic plant
artificial chromosomes. The selection of appropriate plant lines can be done,
for example, using marker-assisted selection.

F. Incorporation of Heterologous Nucleic Acids into Artificial 
Chromosomes 

Heterologous nucleic acids can be introduced into artificial
chromosomes during or after formation. Incorporation of particular desired
nucleic acids into an artificial chromosome during generation thereof may be
accomplished by including the desired nucleic acids along with the nucleic
acid encoding a selectable marker and any other nucleic acids used in
artificial chromosome generation (e.g., targeting sequences that direct the
heterologous nucleic acid to the pericentric region of a chromosome) in the
transformation of a cell to initiate amplification and formation of artificial
chromosomes.

Alternatively, heterologous nucleic acids may be incorporated into an
artificial chromosome following formation thereof through transfection of a
cell containing the artificial chromosome with the heterologous nucleic acids.
In general, incorporation of such nucleic acids into the artificial chromosome
is assured through site-directed integration, such as may be accomplished by
including nucleic acids homologous or identical to DNA contained within the
artificial chromosome with the heterologous nucleic acid when transferring
it to the artificial chromosome. An additional selective marker gene may also
be included.

Additionally, introduction of nucleic acids, particularly DNA molecules,
to an artificial chromosome can be accomplished by the use of site-specific
recombinases as described herein (see, also, copending U.S. provisional
application Serial No. 60/294,758 by Perkins et al. entitled "CHROMOSOME-BASED
PLATFORMS" filed on May 30, 2001, U.S. provisional application
Serial No. 60/366,891, by Perkins et al. entitled "CHROMOSOME-BASED
PLATFORMS" filed on March 21, 2002, U.S. patent application Serial No.
, by Perkins et al. entitled "CHROMOSOME-BASED PLATFORMS" filed
on May 30, 2002, under attorney docket no. 24601-420, and PCT
International Application No. , by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002, under
attorney docket no. 24601-420PC; each of which is incorporated in its
entirety by reference thereto). Artificial chromosomes can be produced
containing recombinase recognition sequences, to allow the site-specific
introduction of DNA molecules into the same. Another use for an introduced
recombinase site is to provide a region for site-specific integration of a new
trait by the use of recombinase mediated gene insertion.

G. Introduction of Artificial Chromosomes into Plant Cells and Recovery
of Plants Containing Artificial Chromosomes

Artificial chromosomes can be introduced into plant cells by a variety
of methods familiar to those skilled in the art. These methods include
chemical and physical methods for introduction of foreign DNA, as well as
cell culture methods to transfer chromosomes from one cell to another cell.

Any type of artificial chromosome can be used. Plant artificial
chromosomes (PACs) can be prepared by the in vivo and in vitro methods
described herein. PACs can be prepared inside plant protoplasts and then
transferred to other plant species and tissues, in particular to other plant
protoplasts, via fusion in the presence or absence of PEG as described herein
(Draper et al. (1982) Plant Cell Physiol. 23:451-458; Krens et al. (1982)
Nature 296:72-74). PACs can be isolated from the protoplasts in which they
were prepared, encapsulated into liposomes, and delivered to other plant
protoplasts (Deshayes et al. (1985) EMBO J. 4:2731-2737). Alternatively,
the PACs can be isolated and delivered directly to plant protoplasts, plant
cells, or other plant targets via a PEG-mediated process, calcium
phosphate-mediated process, electroporation, microinjection, particle bombardment,
lipid-mediated method with or without sonoporation, sonoporation alone, or
any method known in the art as described herein (Haim et al. (1985) Mol.
Gen. Genet. 199:161-168; Fromm et al. (1986) Nature 319:791-793; Fromm
et al. (1985) Proc. Nat. Acad. Sci. USA 82:5824-5828; Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Nat. Acad. Sci. USA 85:8502-8505;
and International PCT application publication no. WO 91/00358). Plant
artificial chromosomes can also be transferred to other plant species by
preparation of protoplast-derived plant microcells, and fusion of the
microcells containing the plant artificial chromosome with plant cells of other
plant species.

Mammalian artificial chromosomes (MACs) can be transferred to plant
cells. Mammalian artificial chromosomes are prepared by the in vivo and in
vitro methods described in US Patent Nos. 6,025,155 and 6,077,697, and
International PCT application No. WO 97/40183. MACs can be prepared as
microcells, and the microcells can be fused with plant protoplasts in the
presence or absence of PEG (Dudits et al. (1976) Hereditas 82:121-123;
Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149). Alternatively, the MACs
can be isolated and delivered directly to plant cells, protoplasts, and other
plant targets using a PEG-mediated process, calcium phosphate-mediated
process, electroporation, microinjection, lipid-mediated method with or
without sonoporation, sonoporation alone, or any method known in the art as
described herein and in US Patent Nos. 6,025,155 and 6,077,697, and
International PCT application publication No. WO 97/40183.

After PACs or MACs are introduced into plant targets and the plant
targets are grown and analyzed for transfection, the transformed plant
targets can be developed using standard conditions into roots, shoots,
plantlets, or any structure capable of growing into a plant.

Accordingly, methods for the introduction of artificial chromosomes 
represent the first step in the production of plant cells and whole plants 
containing artificial chromosomes from a variety of sources. 

The ability to introduce genes into plants, such that they are stably
expressed and transmissible from generation to generation, has
revolutionized plant biology and opens up new possibilities for using plants
as green factories for the production of commercially useful products as well
as for other applications described herein. There are several approaches to
the generation of stably transformed plants, and the adopted approach varies
according to the aims of the project. For introduction of artificial
chromosomes into plants, a variety of methods may be employed. For
transgenic plants, the transformation process involves the methods of foreign
DNA delivery to plant host cells, the growth and analysis of transformed
plant host cells, and the generation and regeneration of transgenic plants
from transformed plant host cells.

1. Introduction of artificial chromosomes into plant host cells
Numerous methods for producing or developing transgenic plants are
available to those of skill in the art. The method used is primarily a function
of the species of plant. Artificial chromosomes containing heterologous
DNA, such as artificial chromosomes prepared by the methods described
herein, can be introduced into plant host cells, including, but not limited to,
plant cells and protoplasts, by, for example, non-vector mediated DNA
transfer processes (see, also copending U.S. application Serial No.
09/815,979, which describes methods for delivery that can be adapted for
use with plant cells and used with plant protoplasts).

Non-vector mediated, or direct, gene transfer systems involve the
introduction of heterologous DNA, in particular artificial chromosomes, into
host cells, including but not limited to plant cells and protoplasts, without the
use of a biological vector. The artificial chromosome that is introduced into
these plant host cells can lead to the development of transformed,
regenerable transgenic plants. The direct gene transfer systems for
transgenic plants are designed to overcome the barrier to DNA uptake
caused by the cell wall and the plasma membrane of plant cells. The
approaches for direct gene transfer include, but are not limited to, chemical,
electrical, and physical methods, which can also be adapted to optimize
transfer of artificial chromosomes (see, e.g., Uchimiya et al. (1989) J. of
Biotech. 12:1-20 for a review of such procedures, see also, e.g., U.S.
Patent Nos. 5,436,392; 5,489,520; Potrykus et al. (1985) Mol. Gen. Genet.
199:183; Lorz et al. (1985) Mol. Gen. Genet. 199:178; Fromm et al. (1985)
Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828; Uchimiya et al. (1986) Mol.
Gen. Genet. 204:204; Callis et al. (1987) Genes Dev. 1:1183-1200; Callis et
al. (1987) Nuc. Acids Res. 15:5823-5831; Marcotte et al. (1988) Nature
335:454 and Toriyama et al. (1988) Bio/Technology 6:1072-1074).
a. Chemical methods
Uptake of artificial chromosomes into plant cells, such as protoplasts,
can be accomplished in the absence or presence of polyethylene glycol
(PEG), which is a fusogen, or by any variations of such methods known to
those of skill in the art [see, e.g., U.S. Patent No. 4,684,611 to Schilperoot
et al.; Paskowski et al. (1984) EMBO J. 3:2717-2722; U.S. Patent Nos.
5,231,019 and 5,453,367]. In one approach, plant protoplasts are
incubated with a solution of foreign DNA, in particular artificial
chromosomes, and PEG at a concentration that allows for high cell survival
and high efficiency chromosome uptake. The protoplasts are then washed
and cultured [Datta and Datta (1999) Meth. in Molecular Biol. 111:335-348].
In an alternative approach, plant protoplasts are incubated with artificial
chromosomes in the presence of calcium phosphate for direct artificial
chromosome uptake (Haim et al. (1985) Mol. Gen. Genet. 199:161-168).
Alternatively, the artificial chromosome, in particular plant artificial
chromosome (PAC), is formed in a plant protoplast which is, in turn, fused
with another plant protoplast in the presence or absence of PEG to transfer
the PAC to the plant host protoplast. Such methods for treating protoplasts
with PEG and foreign DNA are well known in the art (Draper et al. (1982)
Plant Cell Physiol. 23:451-458; Krens et al. (1982) Nature 296:72-74).

Another chemical direct gene transfer method involves lipid-mediated
delivery of artificial chromosomes to plant protoplasts. In this process,
liposomes with encapsulated artificial chromosomes are allowed to fuse with
protoplasts alone or in the presence of PEG as the fusogen to transfer the
foreign DNA, in particular artificial chromosome, to the plant host protoplast
(Deshayes et al. (1985) EMBO J. 4:2731-2737; Fraley and Paphadjopoulos
(1982) Curr Top Microbiol Immunol 96:171-191).

Another direct gene transfer method involves the use of microcells.
The chromosomes can be transferred by preparing microcells containing
artificial chromosomes and then fusing the microcells with plant protoplasts.
Methods for the preparation and fusion of microcells with other cells are well
known in the art (see Example No. 4 and see also, e.g., U.S. Patent Nos.
5,240,840; 4,806,476; 5,298,429; 5,396,767; Fournier (1981) Proc. Natl.
Acad. Sci. U.S.A. 78:6349-6353; and Lambert et al. (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:5907-59; Dudits et al. (1976) Hereditas 82:121-123;
Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149).
b. Electrical methods 
Electroporation, which involves applying high-voltage electrical pulses to a
solution containing a mixture of protoplasts or plant cells and foreign DNA, in
particular artificial chromosomes, to create nanometer-sized, reversible pores,
is a common method to introduce DNA into plant cells or protoplasts. The
exogenous DNA may be added to the protoplasts in any form such as, for
example, naked linear, circular or supercoiled DNA, artificial chromosomes
encapsulated in liposomes, DNA in spheroplasts, artificial chromosomes in
other plant protoplasts, artificial chromosomes complexed with salts, and
other forms. The foreign DNA, in particular artificial chromosome, can also
include a phenotypic marker to identify plant cells that are successfully
transformed.

When plant cells or protoplasts are subjected to short electrical DC (direct
current) pulses, they may experience an increase in the permeability of the
plasma membrane and/or cell wall to hydrophilic molecules such as nucleic
acids, which are normally unable to enter the plant cell directly. Nucleic
acids are taken directly into the cell cytoplasm either through these pores or
as a consequence of the redistribution of membrane components that
accompanies closure of the pores. Certain cell wall-degrading enzymes, such
as pectin-degrading enzymes, may be employed to render the plant target
recipient cells more susceptible to DNA or artificial chromosome uptake by
electroporation than untreated cells. Plant recipient cells may also be
susceptible to transformation by mechanical wounding. To effect
transformation by electroporation, friable tissues such as a suspension
culture of cells or embryonic callus may be used or immature embryos or
other organized tissues may be directly transformed (see, e.g., Fromm et al.
(1986) Nature 319:791-793). Methods for effecting electroporation are well
known in the art (see, e.g., U.S. Patent Nos. 4,784,737; 4,970,154;
5,304,486; 5,501,967; 5,501,662; 5,019,034; 5,503,999; see, also Fromm
et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828; Zimmerman et al.
(1981) Biophys Biochem Acta 641:160-165; Neuman et al. (1982) EMBO J.
1:841-845; Riggs et al. (1986) Proc. Nat. Acad. Sci. USA 83:5602-5606;
Lurquin (1997) Mol. Biotechnol. 7:5-35; Bates (1999) Methods in Molecular
Biology 111:359-366). Electroporation can be used to introduce nucleic
acids into tobacco mesophyll cells (Morikawa et al. (1986) Gene 41:121-124);
leaf bases of rice (Dekeyser et al. (1990) Plant Cell 2:591-602);
immature maize embryos (Songstad et al. (1993) Plant Cell Tiss. Orgn. Cult.
40:1-15); macerated immature maize embryos (D'Halluin et al. (1992) Plant
Cell 4:1495-1505); suspension cultured maize cells (Laursen et al. (1994)
Plant Mol. Biol. 24:51-61); and sugar cane (Arencibia et al. (1995) Plant Cell
Rep. 14:305-309).

Artificial chromosomes may be delivered to plant cells, in particular
plant seeds, by using electroporation of pollen to derive pollen
comprising an artificial chromosome. Methods that may be used for delivery
of artificial chromosomes into pollen include, for example, techniques
described in U.S. Patent No. 5,049,500 and by Negrutiu et al. [in
Biotechnology and Ecology of Pollen, Mulcahy et al. eds., (1986) Springer
Verlag, N.Y., pp. 65-69] and Fromm et al. [(1986) Nature 319:791; including
methods for introducing DNA into mature pollen using various procedures
such as heat shock, PEG and electroporation]. The pollen is capable of
germinating and fertilizing an egg cell, leading to the formation of a plant
seed comprising an artificial chromosome.
c. Physical methods 
The physical methods approach for introducing foreign DNA, in
particular artificial chromosomes, into plant cells overcomes the cell wall
barrier to DNA movement. Physical, or mechanical means, are used to
introduce transgenes directly into protoplasts or plant cells and include, but
are not limited to, microinjection, particle bombardment, and sonoporation.

(1) Microinjection 

Microinjection involves the mechanical injection of heterologous DNA,
in particular artificial chromosomes, into plant cells, including cultured cells
and cells in intact plant organs and embryoids in tissue culture via very small
micropipettes, needles, or syringes (Neuhaus et al. (1987) Theor. Appl. Genet.
75:30-36; Reich et al. (1986) Can. J. Bot. 64:1255-1258; Crossway et al.
(1986) BioTechniques 4:320-334; Crossway et al. (1986) Mol. Gen. Genet.
20:179; U.S. Patent No. 4,743,548; for silicon carbide whiskers, see Kaeppler
et al. (1990) Plant Cell Rep. 9:415-418; Frame et al. (1994)). For example,
microinjection of protoplast cells with foreign DNA for transformation of plant
cells has been reported for barley and tobacco (see, e.g., Holm et al. (2000)
Transgenic Res. 9:21-32 and Schnorf et al. Transgenic Res. 7:23-30). Single
artificial chromosomes may be front-loaded into microinjection needles and
then injected into cells ("pick-and-inject") following procedures as described
by Co et al. [(2000) Chromosome Res. 8:183-191].

(2) Particle bombardment 

Microprojectile bombardment (acceleration of small high density
particles, which contain the DNA, to high velocity with a particle gun
apparatus, which forces the particles to penetrate plant cell walls and
membranes) has also been used to introduce heterologous DNA into plant
cells. Microprojectile bombardment techniques for the introduction of nucleic
acids into plant cells, in addition to being an effective means of reproducibly
stably transforming plant cells, particularly monocots, do not require isolation
of protoplasts or susceptibility of the host cell to Agrobacterium infection. In
these methods, nucleic acids are carried through the cell wall and into the
cytoplasm on the surface of small, typically metal, particles (see, e.g., Klein
et al. (1987) Nature 327:70; Klein et al. (1988) Proc. Natl. Acad. Sci. U.S.A.
85:8502-8505; Klein et al. in Progress in Plant Cellular and Molecular
Biology, eds. Nijkamp, H.J.J., Van der Plas, J.H.W., and Van Aartrijk, J.,
Kluwer Academic Publishers, Dordrecht, (1988), p. 56-66 and McCabe et al.
(1988) Bio/Technology 6:923-926; Sautter et al. (1991) Bio/Technol.
9:1080-1085; Gordon-Kamm et al. (1990) Plant Cell 2:603-618; Finer et al.
(1999) Curr. Top. Microbiol. Immunol. 240:59-80; Vasil and Vasil (1999)
Methods in Molecular Biology 111:349-358; Seki et al. (1999) Mol.
Biotechnol. 11:251-255). Particles may be coated with nucleic acids and
delivered into cells by a propelling force. Exemplary particles include those
containing tungsten, gold or platinum, as well as magnesium sulfate crystals.
The metal particles can penetrate through several layers of cells and thus
allow the transformation of cells within tissue explants.

In an illustrative embodiment (see, e.g., U.S. Patent No. 6,023,013) of
a method for delivering foreign nucleic acids into plant cells, e.g., maize
cells, by acceleration, a Biolistics Particle Delivery System may be used to
propel particles coated with DNA or cells through a screen, such as a
stainless steel or Nytex screen, onto a filter surface covered with plant (e.g.,
corn) cells cultured in suspension. The screen disperses the particles so that
they are not delivered to the recipient cells in large aggregates. The
intervening screen between the projectile apparatus and the cells to be
bombarded may reduce the size of projectile aggregates and may contribute
to a higher frequency of transformation by reducing damage inflicted on the
recipient cells by projectiles that are too large.

For the bombardment, cells in suspension may be concentrated on
filters or solid culture medium. Alternatively, immature embryos or other
plant target cells may be arranged on solid culture medium. The cells to be
bombarded are typically positioned at an appropriate distance below the
microprojectile stopping plate. If desired, one or more screens may also be
positioned between the acceleration device and the cells to be bombarded.

The prebombardment culturing conditions and bombardment
parameters may be optimized to yield the maximum numbers of stable
transformants. Both the physical and biological parameters for bombardment
are important in this technology. Physical factors include those that involve
manipulating the DNA/microprojectile precipitate or those that affect the
flight and velocity of either the macro- or microprojectiles. Biological factors
include all steps involved in manipulation of cells before and immediately
after bombardment, the osmotic adjustment of target cells to help alleviate
the trauma associated with bombardment, and also the nature of the
transforming nucleic acid, such as linearized DNA, intact supercoiled
plasmids, or artificial chromosomes.

Physical parameters that may be adjusted include gap distance, flight
distance, tissue distance and helium pressure. In addition, transformation
may be optimized by adjusting the osmotic state, tissue hydration and
subculture stage or cell cycle of the recipient cells. Ballistic particle
acceleration devices are available from Agracetus, Inc. (Madison, WI) and
BioRad (Hercules, CA).

Techniques for transformation of A188-derived maize line using
particle bombardment are described in Gordon-Kamm et al. (1990) Plant Cell
2:603-618 and Fromm et al. (1990) Biotechnology 8:833-839.
Transformation of rice may also be accomplished via particle bombardment
(see, e.g., Christou et al. (1991) Biotechnology 9:957-962). Particle
bombardment may also be used to transform wheat (see, e.g., Vasil et al.
(1992) Biotechnology 10:667-674 for transformation of cells of type C
long-term regenerable callus; and Weeks et al. (1993) Plant Physiol.
102:1077-1084 for transformation of wheat using particle bombardment of immature
embryos and immature embryo-derived callus). The production of transgenic
barley using bombardment methods is described, for example, by Koprek et
al. (1996) Plant Sci. 119:79-91.

(3) Sonoporation
Foreign DNA, in particular artificial chromosomes, may be introduced
into plant protoplasts using ultrasound treatment, in particular mild
ultrasound treatment (10-100 kHz), to create pores for DNA uptake (see, e.g.,
International PCT application publication no. WO 91/00358) or may be
introduced into plant protoplasts via a sonoporation machine (ImaRx
Pharmaceutical Corp., Tucson, AZ).

Alternatively, the delivery of artificial chromosomes into plant host
cells is performed by any method described herein or well known in the art.
For example, needle-like whiskers (US 5,302,523, 1994, US 5,464,765)
have been used to deliver foreign DNA.

Suitable plant targets into which foreign DNA, in particular artificial
chromosomes, is transferred include, but are not limited to, protoplasts, cell
culture cells, cells in plant tissue, meristem cells, microspores, callus, pollen,
pollen tubes, egg-cells, embryo-sacs, zygotes or embryos in
different stages of development, seeds, seedlings, roots, stems, leaves,
whole plants, algae, or any plant part capable of proliferation and
regeneration of plants (see, e.g., U.S. Patent Nos. 5,990,390; 6,037,526
and 5,990,390). The growth of the transformed plant targets described
herein can be done with tissue-culture or non-tissue culture methods, with the
preferred methods being tissue culture methods.

All plant cells into which foreign DNA, in particular artificial
chromosomes, is introduced and that are regenerated from the transformed
cells are used directly for expressed purposes (e.g., herbicide resistance,
insect/pest resistance, disease resistance, environmental/stress resistance,
nutrient utilization, male sterility, improved nutritional content, production of
chemicals or biologicals, non-protein expressing sequences, and preparation
and screening of libraries) as described herein or are used to produce
transformed whole plants for the applications and uses described herein. The
particular protocol and means for the introduction of the artificial
chromosome into the plant host is adapted or refined to suit the particular
plant species or cultivar.

Chromosomes may be transferred to cells by microcell mediated
chromosome transfer (MMCT) (Telenius et al., Chromosome Research 7:3-7,
1999; Ramulu et al., Methods in Molecular Biology 111:227-242, 1999). In
general, donor plant cultures or donor mammalian cell cultures are incubated
in media supplemented with reagents that inhibit DNA synthesis (e.g.,
hydroxyurea, aphidicolin) and/or reagents that inhibit attachment of
chromosomes to the mitotic spindle (e.g., colcemid, colchicine,
amiprophos-methyl, cremart). The cell walls of plant cells are digested with
enzymes (e.g., cellulase, macerozyme), producing protoplasts. Donor plant
protoplasts or donor mammalian cells are loaded on a Percoll gradient in the
presence of cytochalasin-B (which causes the cell cytoskeleton to
depolymerize into monomer protein subunits) and centrifuged at 10^5 x g.
During centrifugation the metaphase chromosomes are extruded through the
plasma membrane, forming plant 'microprotoplasts' or mammalian
'microcells.' The microprotoplasts/microcells are filtered through nylon
sieves of decreasing pore size (8-3 µm) to isolate smaller ones that contain
predominantly one metaphase chromosome. The microprotoplasts/microcells
are fused to recipient plant protoplasts or mammalian cells by polyethylene
glycol (PEG) treatment. The fusion mixture is cultured in appropriate media.
If the chromosome of interest expresses a selection marker gene, the
fusion mixture may be cultured in appropriate media supplemented with the
appropriate selection drug (e.g., hygromycin, kanamycin).

2. The growth of transformed plant host cells

In tissue culture methods, plant cells or protoplasts transformed by the
chemical, physical, or electrical methods described herein are grown, or
cultured, under selective conditions. The selectable markers are integrated
into the heterologous DNA, in particular the artificial chromosome, before
its introduction into plant hosts or are integrated into the plant host after
transfection. An additional marker can be used for double selection.
Generally, the plant cells or protoplasts are grown for numerous generations,
after which the transformed cells are identified.

The transformed cells are subjected to conditions known in the art for
callus initiation. Tissue that develops during the initiation period is placed
in a regeneration or selection medium where shoot and root development
occur. The plantlets are analyzed to determine whether they are transformed
(International PCT application publication no. WO 00/60061). In the case of
maize, embryogenic callus cultures are initiated from immature maize embryos,
bombarded with genes, and developed into plantlets by the methods
described in International PCT application publication no. WO 00/60061. In
tissue culture methods, rice calli are transformed with DNA encoding
insecticidal proteins CryIA(b) and CryIA(c) for insect resistance. Common
tissue culture methods can also be used to transform tobacco and tomato
(see, e.g., US Patent No. 6,136,320), embryogenic maize calli (US Pat. Nos.
5,508,468; 5,538,877; 5,538,880; 5,780,708; 6,013,863; 5,554,798;
5,990,390; and 5,484,956) and other crop species, e.g., potato and
tobacco (Sijmons et al. (1990) Bio/Technol 8:217-221), tobacco
(Vandekerckhove et al. (1989) Bio/Technol 7:929-932 and Owen and Pen
eds. Transgenic Plants: A Production System for Industrial and
Pharmaceutical Proteins, John Wiley & Sons, Chichester, 1996) and rice
(Zhu et al. (1994) Plant Cell Tiss Org Cult 36:197-204).
3. Analysis of transformed plant host cells

Once foreign DNA, in particular artificial chromosomes, is introduced
into plant hosts and the cells or protoplasts are grown and developed under
the conditions described herein, the plant cells or protoplasts that were
transformed with artificial chromosomes are identified. The plant cell,
protoplast, callus, leaf disc, or other plant target is screened for the
presence of artificial chromosomes by various methods well known in the art
including, but not limited to, assays for the expression of reporter genes,
PCR of the isolated plant chromosomes or DNA, electron microscopy,
visualization methods, and in situ hybridization with chromosome painting
probes as described herein. Moreover, cells treated with artificial
chromosomes are isolated during metaphase using a mitotic arrest agent,
such as colchicine, and the artificial chromosomes are distinguished from
endogenous chromosomes by fluorescence-activated cell sorting, size and
density differences, or by any method well known in the art. Alternatively,
when a selectable marker gene is transmitted with or as part of the artificial
chromosome, selective agents are used to detect the expression of the
selectable marker (International PCT application publication no. WO
00/60061; US Patent No. 6,136,320; Owen and Pen Eds. Transgenic Plants:
A Production System for Industrial and Pharmaceutical Proteins). Enzymatic
assays, immunological assays, bioassays, germination assays, or chemical
assays are used to assess the phenotypic effects of artificial chromosomes
such as insect or fungal resistance or any other expression of genes in
artificial chromosomes (Cheng et al. (1998) 95:2767-2772; US Patent No.
6,126,320; International PCT application publication no. WO 00/60061;
Owen and Pen eds. Transgenic Plants: A Production System for Industrial
and Pharmaceutical Proteins, John Wiley & Sons, Chichester, 1996). The
plant cells, protoplasts, or other plant hosts that are successfully
transformed with artificial chromosomes are used directly to express the
gene of interest or are used to generate transgenic plants.

Fluorescent in situ hybridization (FISH) may be used to screen for the
transfer of artificial chromosomes into plant cells, using DNA probes
specific for the artificial chromosome (e.g., a mouse major satellite DNA
probe for murine satellite DNA based artificial chromosomes; or a kanamycin,
hygromycin or GUS gene DNA probe for a plant artificial chromosome
carrying such a gene). Standard FISH techniques for plant cells have been
described (de Jong et al., Trends in Plant Science 4:258-263, 1999).

IdU labeling can be used to determine the optimum conditions for
transfer of chromosomes in microcells or as isolated artificial chromosomes.
The incorporated IdU increases the fragility of the chromosome and will
increase the probability of cellular mutation. Hence, the cells are fixed
within 48 hours after transfection/fusion and analyzed for chromosome
uptake using various procedures. Once the optimum transfer conditions have
been determined, long-term expression experiments are performed with
unlabeled artificial chromosomes or microcells.
H. Regeneration of transgenic plants

Plants containing artificial chromosomes are generated from plant
cells, protoplasts, calli, or other plant tissue targets into which foreign DNA,
in particular artificial chromosomes, has been introduced. Regeneration
techniques for many commercially important plant species are well known in
the art. The artificial chromosomes that are inserted into plant hosts to
produce transgenic plants are PACs or MACs.

Plants are regenerated by the planting of transformed roots, plantlets,
seeds, seedlings and structures capable of growing into a whole plant
capable of reproduction (see, e.g., US Patent No. 6,136,320 and
International PCT application No. WO 00/60061). The regeneration of maize
plants from transformed protoplasts is described, for example, in European
Patent Application nos. 0 292 435 and 0 392 225 and International PCT
Application Publication no. WO 93/07278; the regeneration of rice following
gene transfer is described in Zhang et al. (1988) Plant Cell Rep. 7:379-384;
Shimamoto et al. (1989) Nature 338:274-277; and Datta et al. (1990)
Bio/Technology 8:736-740; and the regeneration of fertile transgenic barley
by direct DNA transfer to protoplasts is described by Funatsuki et al. (1995)
Theor. Appl. Genet. 91:707-712. Alternatively, plants containing artificial
chromosomes are obtained by crossing a plant containing an artificial
chromosome with another plant to produce plants having an artificial
chromosome in their genomes (see, e.g., US Patent No. 6,150,585).




Plants containing an artificial chromosome are propagated through
seed, cuttings, or vegetatively. The seed from plants containing an artificial
chromosome is grown in the field, in pots, indoors, outdoors, in
greenhouses, on glass, or in or on any suitable medium, and the resulting
sexually mature transgenic plants are self-pollinated to generate true
breeding plants. The progeny from these transgenic plants become true
breeding lines (International PCT application publication Nos. WO 00/60061
and EP 1017268; US Patent Nos. 5,631,152; 5,955,362; 6,015,940;
6,013,523; 6,096,546; 6,037,527; 6,153,812; Weissbach and Weissbach
(1988) Methods for Plant Molecular Biology, Academic Press, Inc.; Fromm et
al. (1990) Bio/Technology 8:833-839; Gordon-Kamm et al. (1990) Plant Cell
2:603-608; Koziel et al. (1993) Bio/Technology 11:194-200; and Golovkin et
al. (1993) Plant Sci. 90:41-52).
1. PACs

Plant artificial chromosomes (PACs) are prepared by the in vivo and in
vitro methods described herein. PACs may be prepared inside plant
protoplasts and then transferred to plant targets, in particular to other plant
protoplasts, via fusion in the presence or absence of PEG as described herein
(Draper et al. (1982) Plant Cell Physiol. 23:451-458; Krens et al. (1982)
Nature 296:72-74). PACs are isolated from the protoplasts in which they were
prepared, encapsulated into liposomes, and delivered to other plant
protoplasts (Deshayes et al. (1985) EMBO J. 4:2731-2737). Alternatively,
the PACs are isolated and delivered directly to plant protoplasts, plant cells,
or other plant targets via a PEG-mediated process, calcium
phosphate-mediated process, electroporation, microinjection, sonoporation,
or any method known in the art as described herein (Hain et al. (1985) Mol.
Gen. Genet. 199:161-168; Fromm et al. (1986) Nature 319:791-793; Fromm
et al. (1985) Proc. Nat. Acad. Sci. USA 82:5824-5828; Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Nat. Acad. Sci. USA 85:8502-8505;
and International PCT application publication no. WO 91/00358).

2. MACs

Mammalian artificial chromosomes (MACs) are prepared by the in vivo
and in vitro methods described in US Patent Nos. 6,025,155 and 6,077,697,
and International PCT application No. WO 97/40183. MACs are prepared as
microcells, and the microcells are fused with plant protoplasts in the
presence or absence of PEG (Dudits et al. (1976) Hereditas 82:121-123;
Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149). Alternatively, the MACs
are isolated and delivered directly to plant cells, protoplasts, and other plant
targets via a PEG-mediated process, calcium phosphate-mediated process,
electroporation, microinjection, sonoporation, or any method known in the
art as described herein and in US Patent Nos. 6,025,155 and 6,077,697,
and International PCT application publication No. WO 97/40183.

After PACs or MACs are introduced into plant targets and the plant
targets are grown and analyzed for transfection, the transformed plant
targets are developed using standard conditions into roots, shoots, plantlets,
or any structure capable of growing into a plant. Transgenic plants can, in
turn, be generated by the planting of transformed roots, plantlets, seeds,
seedlings and structures capable of growing into a plant. Transgenic
plants can be propagated, for example, through seed, cuttings, or vegetative
propagation.

I. Applications and Uses of Artificial Chromosomes 

Artificial chromosomes provide convenient and useful vectors, and in
some instances (e.g., in the case of very large heterologous genes) the only
vectors, for introduction of heterologous genes into hosts. Virtually any
gene of interest is amenable to introduction into a host via artificial
chromosomes.

As described herein, there are numerous methods for using artificial
chromosomes to introduce coding sequences into plant cells. These include
methods for using artificial chromosomes to express genes encoding
commercially valuable enzymes and therapeutic compounds in plant cells,
to introduce agronomically important traits, and for applications related to
the manipulation of large regions of DNA.

The artificial chromosomes provided herein may be used in methods of
protein and gene product production, particularly using plant cells as host
cells for production of such products, and in cellular production systems in
which the artificial chromosomes provide a reliable, stable and efficient
means for optimizing the biomanufacturing of important compounds for
medicine and industry. They are also intended for use in methods of gene
therapy and for production of transgenic organisms, particularly plants
(discussed above, below and in the EXAMPLES).

1. Production of products in plants

Methods for expression of heterologous proteins in plant cells
("molecular farming") are provided. At present, many foreign proteins have
been expressed in whole plants or selected plant organs. Plants can offer a
highly effective and economical means to produce recombinant proteins as
they can be grown on a large scale at modest cost. The production of
heterologous proteins in plants has included genes that are fused to strong
constitutive plant promoters (e.g., 35S from cauliflower mosaic virus;
Sijmons et al., 1990, Bio/Technology, 8:217-221, Benfey and Chua, US
5,110,732, Fraley et al., US 5,858,742, McPherson and Kay, US
5,359,142), seed specific promoters (Hall et al., US 5,504,200, Knauf et al.,
US 5,530,194, Thomas et al., US 5,905,186, Moloney, US 5,792,922, US
5,948,682) or promoters active in other plant organs such as fruit (Radke et
al., 1988, Theoret. Appl. Genet., 75:685-694, Bestwick et al., US
5,783,394, Houck and Pear, US 4,943,674) or storage organs such as
tubers (Rocha-Sosa et al., US 5,436,393, US 5,723,757). The genes under
the control of these promoters can encode any protein and include, for
example, genes that encode receptors, cytokines, enzymes, proteases,
hormones, growth factors, antibodies, tumor suppressors, vaccines,
therapeutic products and multigene pathways.
For example, industrial enzymes that can be produced include
α-amylase, glucanase, phytase and xylanase (see Goddijn and Pen
(1995) Trends Biotechnol. 13:379-387; Pen et al. (1992) Bio/Technology
10:292-296; Horvath et al. (2000) Proc. Natl. Acad. Sci. U.S.A.
97:1914-1919; and, e.g., Herbers and Sonnewald (1996) in Transgenic Plants:
A Production System for Industrial and Pharmaceutical Proteins, Owen and
Pen Eds., John Wiley & Sons, West Sussex, England), proteases such as
subtilisin and other industrially important enzymes. Additional proteins that
can be produced in crops by molecular farming include other industrial
enzymes, for example, proteases, carbohydrate modifying enzymes such as
glucose oxidase, cellulases, hemicellulases, xylanases, mannanases or
pectinases (e.g., Baszczynski et al., US 5,824,870, US 5,767,379, Bruce et
al., US 5,804,694). Additionally, enzymes particularly valuable in the pulp
and paper industry, such as ligninases or xylanases, also can be expressed
(Austin-Philips et al., US 5,981,835). Other examples of enzymes include
phosphatases, oxidoreductases and phytases (van Ooijen et al., US
5,714,474).

Additionally, expression and delivery of vaccines in plants has been
proposed (Arntzen and Lam, US 6,136,320, US 5,914,123, Curtiss and
Cardineau, US 5,679,880, US 5,654,184, Lam and Arntzen,
US 5,612,487, US 6,034,298, Rymerson et al., WO9937784A1), as well as
antibodies (Conrad et al., WO 972900A1, Hein et al., US 5,959,177, Hiatt
and Hein, US 5,202,422, US 5,639,947, Hiatt et al., US 6,046,037),
peptide hormones (Vandekerckhove, J.S., US 5,487,991, Brandle et al.,
WO9967401A2), blood factors and similar therapeutic molecules.
Expression of vaccines in edible plants can provide a means for drug delivery
which is cost effective and particularly suited for the administration of
therapeutic agents in rural or underdeveloped countries. The plant material
containing the therapeutic agents could be cultivated and incorporated into
the diet (Lam, D.M., and Arntzen, C.J., US 5,484,719). Similarly, plants
used for animal feed can be engineered to express veterinary biologics that
can provide protection against animal disease (Rymerson et al.,
WO9937784A1). Antibodies also can be produced in plants, including, for
example, a gene fusion encoding an antigen-binding single chain Fv protein
(scFv) that recognizes the hapten oxazolone (Fiedler and Conrad (1995)
Bio/Technology 13:1090-1093) and IgG (Ma et al. (1995) Science
268:716-719). Monoclonal antibodies for therapeutic and diagnostic
applications are of particular interest.

Examples of human biopharmaceuticals that can be expressed in
plants include, but are not limited to, albumin (Sijmons et al. (1990)),
enkephalins (Vandekerckhove et al. (1989)), interferon-α (Zhu et al. (1994))
and GM-CSF (Ganz et al. (1996) in Transgenic Plants: A Production System
for Industrial and Pharmaceutical Proteins, Owen and Pen Eds., John Wiley &
Sons, West Sussex, England, pp. 281-297; and Sardana et al. (1998) in
Methods in Biotechnology, Vol. 3: Recombinant Proteins from Plants:
Production and Isolation of Clinically Useful Compounds, Cunningham and
Porter, Eds., Humana Press, New Jersey; pp. 77-87).

Cells containing the artificial chromosomes provided herein can
advantageously be used in in vitro plant cell-based systems for production of
proteins, particularly several proteins from one cell line, such as multiple
proteins involved in a biochemical pathway or multivalent vaccines. The
genes encoding the proteins are introduced into the artificial chromosomes
which are then introduced into plant cells. Plant cells useful for this purpose
are those that grow well in culture, or most preferably, plant cells capable of
being regenerated to whole plants. Plants can then be cultivated by common
methods to produce plant material comprising said heterologous proteins.
The heterologous proteins can be subject to purification or the plant tissue or
extracts thereof can be used directly for vaccination, amelioration of disease,
or processing of material, such as bleaching during pulp and paper
processing or enzymatic conversion of industrial materials or feedstocks.
Alternatively, the heterologous gene(s) of interest are transferred into a
production cell line or plant line that already contains artificial chromosomes
in a manner that targets the gene(s) to the artificial chromosomes. The cells
or plants are grown under conditions whereby the heterologous proteins are
expressed. Because the proteins are expressed at high levels in a stable
permanent extra-genomic chromosomal system, selective conditions are not
required.

Selection of host lines for use in artificial chromosome-based protein
production systems is within the skill of the art, but often will depend on a
variety of factors, including the properties of the heterologous protein to be
produced, potential toxicity of the protein in the host cell, any requirements
for post-translational modification (e.g., glycosylation, amination,
phosphorylation) of the protein, transcription factors available in the cells,
the type of promoter element(s) being used to drive expression of the
heterologous gene, whether production is completely intracellular or the
heterologous protein will preferably be secreted from the cell, or be
sequestered or localized, and the types of processing enzymes in the cell.

Artificial chromosomes can be engineered as platforms for the
production of specific molecules in plant cells. For example, production of
complex mammalian molecules, such as multichain antibodies, requires a
number of protein activities not normally found in plant species. It is
possible to produce an artificial chromosome that comprises all of the
mammalian activities needed to produce human antibodies, correctly modified
and processed, by introducing into an artificial chromosome the genes
needed to carry out these activities. Said genes would be modified, for
example, by placing each gene under the control of a plant promoter, or by
placing the master control gene, i.e., a gene that controls expression of the
various genes, under the control of a plant promoter. Alternatively,
mammalian transcriptional control factors could be introduced, under the
control of plant active promoters, to be expressed in a plant cell and cause
the expression of said target proteins, for example multichain antibodies.

In this fashion, plant artificial chromosomes are developed, each
capable of supporting the efficient production of a specific class of valuable
products, for example, antibodies, blood clotting factors, etc. Thus,
production of products within a class, for example, human antibodies, would
simply involve the introduction of a specific antibody coding sequence,
without modification, into the artificial chromosome engineered specifically
for the production of human antibodies. The artificial chromosome would
comprise all of the required genetic activities for the proper expression,
translation and post-translational modification of human antibodies. Such
artificial chromosomes can be used in a variety of applications, such as, but
not limited to, large scale production of numerous specific human
antibodies.

Advantages of plant cells as host cell lines in the production of
recombinant proteins include, but are not limited to, the following: (1)
proteins are post-translationally modified in a manner similar to that in
mammalian systems, (2) plants can be directed to secrete proteins into
stable, dry, intracellular compartments of seeds called endosperm protein
bodies, which can easily be collected, (3) the amount of recombinant product
that can be produced approaches industrial scale levels and (4) health risks
due to contamination with potential pathogens/toxins are minimized.

The artificial chromosome-based system for heterologous protein
production has many advantageous features. For example, as described
above, because the heterologous DNA is located in an independent,
extra-genomic artificial chromosome (as opposed to randomly inserted in an
unknown area of the host cell genome or located as extrachromosomal
element(s) providing only transient expression), it is stably maintained in an
active transcription unit and is not subject to ejection via recombination or
elimination during cell division. Accordingly, it is unnecessary to include a
selection gene in the host cells and thus growth under selective conditions is
also unnecessary. Furthermore, because the artificial chromosomes are
capable of incorporating large segments of DNA, multiple copies of the
heterologous gene and linked promoter element(s) can be retained in these
chromosomes, thereby providing for high-level expression of the foreign
protein(s). Alternatively, multiple copies of the gene can be linked to a single
promoter element and several different genes can be linked in a fused
polygene complex to a single promoter for expression of, for example, all the
key proteins constituting a complete metabolic pathway (see, e.g., Beck von
Bodman et al. (1995) Bio/Technology 13:587-591). Alternatively, multiple
copies of a single gene can be operatively linked to a single promoter, or
each or one or several copies can be linked to different promoters or multiple
copies of the same promoter. Additionally, because artificial chromosomes
have an almost unlimited capacity for integration and expression of foreign
genes, they can be used not only for the expression of genes encoding
end-products of interest, but also for the expression of genes associated with
optimal maintenance and metabolic management of the host cell, e.g., genes
encoding growth factors, as well as genes that facilitate rapid synthesis of
the correct form of the desired heterologous protein product, e.g., genes
encoding processing enzymes and transcription factors as described above.

The artificial chromosomes are suitable for expression of any proteins
or peptides, including proteins and peptides that require in vivo
posttranslational modification for their biological activity. Such proteins
include, but are not limited to, antibody fragments, full-length antibodies,
and multimeric antibodies, tumor suppressor proteins, naturally occurring or
artificial antibodies and enzymes, heat shock proteins, and others.

Thus, such cell-based "protein factories" employing artificial
chromosomes can be generated using artificial chromosomes constructed
with multiple copies (theoretically an unlimited number or at least up to a
number such that the resulting artificial chromosome is about up to the size
of a genomic (i.e., endogenous) chromosome) of protein-encoding genes with
appropriate promoters, or multiple genes driven by a single promoter, i.e., a
fused gene complex (such as a complete metabolic pathway in a plant
expression system; see, e.g., Beck von Bodman (1995) Bio/Technology
13:587-591). Once such an artificial chromosome is constructed, it can be
transferred to a suitable plant species capable of being propagated under
field conditions, or under conditions that permit the recovery of the intended
product. Plant cell cultures such as algae can be used in a system analogous
to mammalian cell culture systems. The advantages of plant based systems
such as this include low input costs for growth, rapid growth rates and
the ability to produce a large biomass economically.

The ability of artificial chromosomes to provide for high-level
expression of heterologous proteins in host cells is demonstrated, for
example, by analysis of mammalian cells containing a mammalian artificial
chromosome, such as the H1D3 and G3D5 cell lines described herein.
Northern blot analysis of mRNA obtained from these cells reveals that
expression of the hygromycin-resistance and β-galactosidase genes in the
cells correlates with the amplicon number of the megachromosome(s)
contained therein.
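
One way such a correlation between amplicon number and expression level
might be quantified is sketched below in Python. This is only an
illustration, not a method described in this application: the function
name pearson_r is hypothetical, and the numbers are made-up placeholder
values, not measurements from the H1D3 or G3D5 cell lines.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values (assumptions for illustration only):
# amplicon copies per cell line and relative Northern blot band intensity.
amplicon_counts = [2, 4, 8, 16]
band_intensity = [1.1, 2.0, 4.2, 7.9]
r = pearson_r(amplicon_counts, band_intensity)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 would be consistent with expression increasing in
step with amplicon number; the choice of a linear correlation measure is
itself an assumption of this sketch.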

Transgenic plants producing these compounds are made by the
introduction and expression of one or potentially many genes using the
artificial chromosomes provided herein. The vast array of possibilities
includes, but is not limited to, any biological compound which is presently
produced by any organism such as proteins, nucleic acids, primary and
intermediary metabolites, carbohydrate polymers, enzymes for uses in
bioremediation, enzymes for modifying pathways that produce secondary
plant metabolites such as flavonoids or vitamins, enzymes that could produce
pharmaceuticals and enzymes that could produce compounds of interest to
the manufacturing industry such as specialty chemicals and plastics. The
compounds are produced by the plant, extracted upon harvest
and/or processing, and used for any presently recognized useful purpose
such as pharmaceuticals, fragrances, and industrial enzymes. Alternatively,
plants produced in accordance with the methods and compositions provided
herein can be made to metabolize certain compounds, such as hazardous
wastes, thereby allowing bioremediation of these compounds.

The artificial chromosomes provided herein can be used in methods of
protein and gene product production, particularly using plant cells as host
cells for production of such products, and in cellular production systems in
which the artificial chromosomes provide a reliable, stable and efficient
means for optimizing the biomanufacturing of important compounds for
medicine and industry.

2. Genetic alteration of organisms to possess desired traits

Artificial chromosomes are ideally suited for preparing organisms, such
as plants, that possess certain desired traits, such as, for example, disease
resistance, resistance to harsh environmental conditions, altered growth
patterns and enhanced physical characteristics. With respect to plants, the
choice of the particular nucleic acid that will be delivered to recipient cells via
artificial chromosomes often will depend on the purpose of the
transformation. One of the major purposes of transformation of crop and
tree species is to add some commercially desirable, agronomically important
traits to the plant. Such traits include, but are not limited to, input and
output traits such as herbicide resistance or tolerance, insect resistance or
tolerance, disease resistance or tolerance (viral, bacterial, fungal or
nematode), stress tolerance and/or resistance, as exemplified by resistance
or tolerance to drought, heat, chilling, freezing, excessive moisture, salt
stress and oxidative stress, increased yields, food content and makeup,
physical appearance, male sterility, drydown, standability, prolificacy, starch
quantity and quality, oil quantity and quality, protein quantity and quality and
amino acid composition. It may be desirable to incorporate one or more
genes conferring such desirable traits into host plants.

a. Herbicide resistance

The genes encoding phosphinothricin acetyltransferase (bar and pat),
glyphosate tolerant EPSP synthase genes, the glyphosate degradative
enzyme gene gox encoding glyphosate oxidoreductase, deh (encoding a
dehalogenase enzyme that inactivates dalapon), herbicide resistant
(e.g., sulfonylurea and imidazolinone) acetolactate synthase, and bxn genes
(encoding a nitrilase enzyme that degrades bromoxynil) are all examples of
herbicide resistance genes for use in plant transformation. The bar and pat
genes code for an enzyme, phosphinothricin acetyltransferase (PAT), which
inactivates the herbicide phosphinothricin and prevents this compound from
inhibiting glutamine synthetase enzymes. The enzyme
5-enolpyruvylshikimate 3-phosphate synthase (EPSP synthase) is normally
inhibited by the herbicide N-(phosphonomethyl)glycine (glyphosate).
However, genes are known that encode glyphosate-resistant EPSP synthase
enzymes. The deh gene encodes the enzyme dalapon dehalogenase and
confers resistance to the herbicide dalapon. The bxn gene codes for a
specific nitrilase enzyme that converts bromoxynil to a non-herbicidal
degradation product.

b. Insect and other pest resistance 

Insect-resistant organisms may be prepared in which resistance or
decreased susceptibility to insect-induced disease is conferred by
introduction into the host organism or embryo of artificial chromosomes
containing DNA encoding gene products (e.g., ribozymes and proteins that
are toxic to certain pathogens) that destroy or attenuate pathogens or limit
access of pathogens to the host. Potential insect resistance genes that can
be introduced into plants via artificial chromosomes include Bacillus
thuringiensis crystal toxin genes or Bt genes (see, e.g., Watrud et al. (1985)
In Engineered Organisms and the Environment). Bt genes may provide
resistance to lepidopteran or coleopteran pests such as the European Corn
Borer (ECB). Such Bt toxin genes include the CryIA(b) and CryIA(c) genes.
Endotoxin genes from other species of B. thuringiensis which affect insect
growth or development also may be employed in this regard. Bt gene
sequences can be modified to effect increased expression in plants, and
particularly monocot plants. Means for preparing synthetic genes are well
known in the art and are disclosed in, for example, U.S. Patent Nos.
5,500,365 and 5,689,052. Examples of such modified Bt toxin genes
include a synthetic Bt CryIA(b) gene (see, e.g., Perlak et al. (1991) Proc.
Natl. Acad. Sci. U.S.A. 88:3324-3328) and the synthetic CryIA(c) gene
termed 1800b (see PCT Application publication no. WO95/06128).
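
As a rough illustration of the kind of sequence features that may be
inspected when a Bt coding sequence is redesigned for plant expression, the
following sketch reports the G+C fraction of a sequence and counts ATTTA
motifs. The choice of these two features is an assumption drawn from general
practice with synthetic plant genes and is not a procedure set out herein; the
function names and the toy sequence are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return sum(base in "GC" for base in seq) / len(seq)


def count_motif(seq: str, motif: str = "ATTTA") -> int:
    """Count occurrences of a motif, allowing overlapping matches."""
    seq, motif = seq.upper(), motif.upper()
    count, start = 0, seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


# Hypothetical toy sequence, not a real Bt gene fragment.
toy = "ATGATTTATTTAGCGC"
print(f"GC fraction: {gc_fraction(toy):.2f}")
print(f"ATTTA motifs: {count_motif(toy)}")
```

Comparing these values before and after a redesign gives only a coarse
indication of how far a synthetic sequence departs from the native one; it
does not predict expression.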

Examples of the types of genes that may be transferred into plants via
artificial chromosomes to generate disease- and/or insect-resistant transgenic
plants include, but are not limited to, the cryIA(b) and cryIA(c) genes which
yield products that are highly toxic to two major rice insect pests (the striped
stem borer and the yellow stem borer) (see, e.g., Cheng et al. (1998) Proc.
Natl. Acad. Sci. U.S.A. 95:2767-2772), cry3 genes which encode products
that are toxic to Coleopteran insects that attack a variety of plants, including
grains and legumes (see, e.g., U.S. Patent No. 6,023,013), genes (e.g., DNA
encoding trichothecene 3-O-acetyltransferase) that confer resistance to
trichothecenes such as those produced by plant fungi (e.g., Fusarium) in
plants particularly susceptible to fungi (e.g., wheat, rye, barley, oats, and
maize) (see, e.g., PCT Application publication no. WO 00/60061), and genes
involved in multi-gene biosynthetic pathways that yield antipathogenic
substances that have a deleterious effect on the growth of plant pathogens
(see, e.g., U.S. Patent No. 5,639,949).

Protease inhibitors may also provide insect resistance (see, e.g.,
Johnson et al. (1989)) and will thus have utility in plant transformation. The
use of a protease inhibitor II gene, pinII, from tomato or potato may be
particularly useful. The combined effect of the use of a pinII gene with a Bt
toxin gene can produce synergistic insecticidal activity. Other genes that
encode inhibitors of the insect's digestive system, or those that encode
enzymes or co-factors that facilitate the production of inhibitors, also may be
useful. This group may be exemplified by oryzacystatin and amylase
inhibitors such as those from wheat and barley.

Genes encoding lectins may confer additional or alternative insecticidal
properties. Lectins (originally termed phytohemagglutinins) are multivalent
carbohydrate-binding proteins which have the ability to agglutinate red blood
cells from a range of species. Lectins have been identified as insecticidal
agents with activity against weevils, ECB and rootworm (see, e.g., Murdock
et al. (1990) Phytochemistry 29:85-89; Czapla & Lang (1990) J. Econ.
Entomol. 83:2480-2485). Lectin genes that may be useful include, for
example, barley and wheat germ agglutinin (WGA) and rice lectins
(Gatehouse et al. (1984) J. Sci. Food Agric. 35:373-380).

Genes controlling the production of large and small polypeptides active
against insects when introduced into the insect pests, such as, for example,
lytic peptides, peptide hormones and toxins and venoms, may also be useful
in generating pest-resistant plants. For example, expression of juvenile
hormone esterase, directed toward specific insect pests, also may result in
insecticidal activity, or cause cessation of metamorphosis (see, e.g.,
Hammock et al. (1990) Nature 344:458-461).

Genes which encode enzymes that affect the integrity of the insect
cuticle are additional examples of genes that may be transferred to plants
via artificial chromosomes to confer resistance to insects. Such genes
include those encoding, for example, chitinase, proteases, lipases and also
genes for the production of nikkomycin, a compound that inhibits chitin
synthesis, the introduction of any of which may be used to produce
insect-resistant plants. Genes that affect insect molting, such as those
affecting the production of ecdysteroid UDP-glucosyl transferase, also can
be useful transgenes.

Genes that code for enzymes that facilitate the production of
compounds that reduce the nutritional quality of the host plant to insect
pests may also be used to confer insect resistance on plants. It may be
possible, for instance, to confer insecticidal activity on a plant by altering its
sterol composition. Sterols are obtained by insects from their diet and are
used for hormone synthesis and membrane stability. Therefore, alterations in
plant sterol composition by expression of genes that directly promote the
production of undesirable sterols or those that convert desirable sterols into
undesirable forms could have a negative effect on insect growth and/or
development and hence endow the plant with insecticidal activity.
Lipoxygenases are naturally occurring plant enzymes that have been shown
to exhibit anti-nutritional effects on insects and to reduce the nutritional
quality of their diet. Therefore, transgenic plants with enhanced
lipoxygenase activity may be resistant to insect feeding.

Tripsacum dactyloides is a species of grass that is resistant to certain
insects, including corn rootworm. Tripsacum may thus include genes
encoding proteins that are toxic to insects or are involved in the biosynthesis
of compounds toxic to insects. Such genes may be useful in conferring
resistance to insects. It is known that the basis of insect resistance in
Tripsacum is genetic, because said resistance has been transferred to Zea
mays via sexual crosses (Branson and Guss, 1972). It is further anticipated
that other cereal, monocot or dicot plant species may have genes encoding
proteins that are toxic to insects which would be useful for producing insect
resistant plants.

Further genes encoding proteins characterized as having potential
insecticidal activity also may be used as transgenes in accordance herewith.
Such genes include, for example, the cowpea trypsin inhibitor (CpTI; Hilder
et al., 1987) which may be used as a rootworm deterrent, genes encoding
avermectin (Avermectin and Abamectin, Campbell, W.C., Ed., 1989; Ikeda
et al., 1987) which may prove particularly useful as a corn rootworm
deterrent, ribosome inactivating protein genes and even genes that regulate
plant structures. Transgenic plants including anti-insect antibody genes and
genes that code for enzymes that can convert a non-toxic insecticide
(pro-insecticide) applied to the outside of the plant into an insecticide inside
the plant also are contemplated.

c. Disease resistance 
Transgenic organisms, such as plants, that express genes that confer
resistance or reduce susceptibility to disease are of particular interest. For
example, the transgene may encode a protein that is toxic to a pathogen,
such as a virus, fungus, mycotoxin-producing organism, nematode or
bacterium, but that is not toxic to the transgenic host.

Because multiple genes can be introduced on an artificial
chromosome, a series of genes encoding a genetic pathway involved in
disease resistance or tolerance can be introduced into crop plants. For
example, it is known that numerous genes are often expressed upon
pathogen invasion; typically, one or more "PR", or pathogenesis-related,
proteins are expressed in response to invasion by a plant bacterial or fungal
pathogen. Genes encoding one or more of the proteins involved in conferring
resistance to pathogens can be contained within an artificial chromosome and
therefore be expressed in a plant cell, in particular a whole transgenic plant
as described herein. In addition, production of single-chain Fv recombinant
antibodies in plants may extend the range of possibilities for the introduction
of pathogen protection in crop plants (see, e.g., Tavladoraki et al. (1993)
Nature 366:469-472).

It has been demonstrated that expression of a viral coat protein in a
transgenic plant can impart resistance to infection of the plant by that virus
and perhaps other closely related viruses (Cuozzo et al., 1988; Hemenway et
al., 1988; Abel et al., 1986). Expression of antisense genes targeted at
essential viral functions may also impart resistance to viruses. For example,
an antisense gene targeted at the gene responsible for replication of viral
nucleic acid may inhibit replication and lead to resistance to the virus.
Interference with other viral functions through the use of antisense genes
also may increase resistance to viruses. Further, it may be possible to
achieve resistance to viruses through other approaches, including, but not
limited to, the use of satellite viruses. Artificial chromosomes are ideally
suited for carrying a multiplicity of these genes and DNA sequences which
are useful for conferring a broad range of resistance to many pathogens.

Genes encoding so-called "peptide antibiotics," pathogenesis-related
(PR) proteins, toxin resistance, and proteins affecting host-pathogen
interactions such as morphological characteristics may also be useful,
particularly in conferring increased resistance to diseases caused by bacteria
and fungi. Peptide antibiotics are polypeptide sequences which are inhibitory
to growth of bacteria and other microorganisms. For example, the classes of
peptides referred to as cecropins and magainins inhibit growth of many
species of bacteria and fungi. Expression of PR proteins in monocotyledonous
plants such as maize may be useful in conferring resistance to bacterial
disease. These genes are induced following pathogen attack on a host plant
and have been divided into at least five classes of proteins (Bol, Linthorst,
and Cornelissen, 1990). Included among the PR proteins are
β-1,3-glucanases, chitinases, and osmotin and other proteins that are
believed to function in plant resistance to disease organisms. Other genes
have been identified that have antifungal properties, e.g., UDA (stinging
nettle lectin) and hevein (Broekaert et al., 1989; Barkai-Golan et al., 1978).
It is known that certain plant diseases are caused by the production of
phytotoxins. Resistance to these diseases may be achieved through
expression of a gene that encodes an enzyme capable of degrading or
otherwise inactivating the phytotoxin. It also is contemplated that
expression of genes that alter the interactions between the host plant and
pathogen may be useful in reducing the ability of the disease organism to
invade the tissues of the host plant, e.g., an increase in the waxiness of the
leaf cuticle or other morphological characteristics.

d. Environment or stress resistance

Improvement of a plant's ability to tolerate various environmental
stresses such as, but not limited to, drought, excess moisture, chilling,
freezing, high temperature, salt, and oxidative stress, also can be effected
through expression of genes therein. It is proposed that benefits may be
realized in terms of increased resistance to freezing temperatures through the
introduction of an "antifreeze" protein such as that of the Winter Flounder
(Cutler et al., 1989) or synthetic gene derivatives thereof. Improved chilling
tolerance also may be conferred through increased expression of
glycerol-3-phosphate acyltransferase in chloroplasts (Wolter et al., 1992).
Resistance to oxidative stress in some crop species (often exacerbated by
conditions such as chilling temperatures in combination with high light
intensities) can be conferred by expression of superoxide dismutase (Gupta
et al., 1993), and may be improved by glutathione reductase (Bowler et al.,
1992). Such strategies may allow for tolerance to freezing in newly emerged
fields as well as extending later-maturity, higher-yielding varieties to earlier
relative maturity zones.

It is contemplated that the expression of genes that favorably affect
plant water content, total water potential, osmotic potential, and turgor will
enhance the ability of the plant to tolerate drought. As used herein, the
terms "drought resistance" and "drought tolerance" are used to refer to a
plant's increased resistance or tolerance to stress induced by a reduction in
water availability, as compared to normal circumstances, and the ability of
the plant to function and survive in lower-water environments. The
expression of genes encoding for the biosynthesis of osmotically-active
solutes, such as polyol compounds, may impart protection against drought.
Within this class are genes encoding mannitol-1-phosphate
dehydrogenase (Lee and Saier, 1982) and trehalose-6-phosphate synthase
(Kaasen et al., 1992). Through the subsequent action of native
phosphatases in the cell or by the introduction and coexpression of a specific
phosphatase, these introduced genes will result in the accumulation of either
mannitol or trehalose, respectively, both of which have been well
documented as protective compounds able to mitigate the effects of stress.
Mannitol accumulation in transgenic tobacco has been verified and
preliminary results indicate that plants expressing high levels of this
metabolite are able to tolerate an applied osmotic stress (Tarczynski et al.,
1992, 1993).

Similarly, the efficacy of other metabolites in protecting either enzyme
function (e.g., alanopine or propionic acid) or membrane integrity (e.g.,
alanopine) has been documented (Loomis et al., 1989), and therefore
expression of genes encoding for the biosynthesis of these compounds might
confer drought resistance in a manner similar to or complementary to
mannitol. Other examples of naturally occurring metabolites that are
osmotically active and/or provide some direct protective effect during
drought and/or desiccation include fructose, erythritol (Coxson et al., 1992),
sorbitol, dulcitol (Karsten et al., 1992), glucosylglycerol (Reed et al., 1984;
Erdmann et al., 1992), sucrose, stachyose (Koster and Leopold, 1988;
Blackman et al., 1992), raffinose (Bernal-Lugo and Leopold, 1992), proline
(Rensburg et al., 1993), glycine betaine, ononitol and pinitol (Vernon and
Bohnert, 1992). Continued canopy growth and increased reproductive
fitness during times of stress will be augmented by introduction and
expression of genes such as those controlling the osmotically active
compounds discussed above and other such compounds. Genes which
promote the synthesis of an osmotically active polyol compound include
genes which encode the enzymes mannitol-1-phosphate dehydrogenase,
trehalose-6-phosphate synthase and myoinositol O-methyltransferase.
Artificial chromosomes can carry a multiplicity of genes to provide durable
stress tolerance, for example, concomitant expression of proline and betaine
and/or polyols.

It is contemplated that the expression of specific proteins also may
increase drought tolerance under certain conditions or in certain crop
species. These may include proteins such as Late Embryogenesis Abundant
(LEA) proteins (see Dure et al., 1989). All three classes of LEAs have been
demonstrated in maturing (i.e., desiccating) seeds. Within LEA proteins, the
Type-II (dehydrin-type) have generally been implicated in drought and/or
desiccation tolerance in vegetative plant parts (i.e., Mundy and Chua, 1988;
Piatkowski et al., 1990; Yamaguchi-Shinozaki et al., 1992). Recently,
expression of a Type-III LEA (HVA-1) in tobacco was found to influence plant
height, maturity and drought tolerance (Fitzpatrick, 1993). In rice, expression
of the HVA-1 gene influenced tolerance to water deficit and salinity (Xu et
al., 1996). Expression of structural genes from all three LEA groups may
therefore confer drought tolerance. Other types of proteins induced during
water stress include thiol proteases, aldolases and transmembrane
transporters (Guerrero et al., 1990), which may confer various protective
and/or repair-type functions during drought stress. It also is contemplated
that genes that affect lipid biosynthesis and hence membrane composition
might also be useful in conferring drought resistance on the plant.

Many of these genes for improving drought resistance have
complementary modes of action. Thus, combinations of these genes might
have additive and/or synergistic effects in improving drought resistance in
plants. Many of these genes also improve freezing tolerance (or resistance);
the physical stresses incurred during freezing and drought are similar in
nature and may be mitigated in similar fashion. Benefit may be conferred via
constitutive expression of these genes, but the preferred means of
expressing these genes may be through the use of a turgor-induced promoter
(such as the promoters for the turgor-induced genes described in Guerrero et
al., 1990 and Shagan et al., 1993, which are incorporated herein by
reference). Spatial and temporal expression patterns of these genes may
enable plants to better withstand stress.

It is proposed that expression of genes that are involved with specific
morphological traits that allow for increased water extraction from drying

is possible for as few as 50 clones to represent the entire
micro-megachromosome.

a. Centromeres 
An exemplary centromere for use in the construction of an artificial
chromosome is that contained within a megachromosome, such as those
described herein. One particular megachromosome-containing cell line
provided is, for example, H1D3 and derivatives thereof, such as
mM2C1 cells. Megachromosomes are isolated from such cell lines utilizing,
for example, the procedures described herein, and the centromeric sequence
is extracted from the isolated megachromosomes. For example, the
megachromosomes may be separated into fragments utilizing selected
restriction endonucleases that recognize and cut at sites that, for instance,
are primarily located in the replication and/or heterologous DNA integration
sites and/or in the satellite DNA. Based on the sizes of the resulting
fragments, certain undesired elements may be separated from the
centromere-containing sequences. The centromere-containing DNA could be
as large as 1 Mb.
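
The fragment-and-size-sort idea described above can be sketched in silico.
The following is a minimal illustration only: EcoRI (recognition site
GAATTC, cutting after the first G) is used purely as a familiar example and is
not an enzyme named in this passage, and the function name and toy sequence
are hypothetical.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[str]:
    """Cut a linear sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site after which the strand
    is cut (1 for EcoRI, which cuts G^AATTC).
    """
    seq, site = seq.upper(), site.upper()
    fragments, last, pos = [], 0, seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[last:cut])
        last = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[last:])
    return fragments


# Hypothetical toy sequence; real megachromosomal DNA is megabases long.
toy = "AAGAATTCCCCCCGAATTCTT"
for frag in sorted(digest(toy, "GAATTC", 1), key=len, reverse=True):
    print(len(frag), frag)
```

Sorting the fragments by length mirrors, in a very simplified way, how
fragments of different sizes would be resolved before the
centromere-containing fraction is selected.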

Probes that specifically recognize centromeric sequences, such as
mouse minor satellite DNA-based probes [see, e.g., Wong et al. (1988) Nucl.
Acids Res. 16:11645-11661], pCT4.2 probe, a 3.5 kb fragment of
Arabidopsis 5S rDNA (Campbell et al. (1992) Gene 112:225-228),
Arabidopsis cosmids E4.11 (30 kb) and E4.6 (33 kb) (Bent et al. (1994)
Science 265:1856-1860), and 180 bp pAL1 repeat sequence (Maluszynska et
al. (1991) Plant J. 1:159-166; and Martinez-Zapater et al. (1986) Mol. Gen.
Genet. 204:417-423), may be used to isolate a centromere-containing YAC,
BAC or PAC clone derived from the megachromosome. Alternatively, or in
conjunction with the direct identification of centromere-containing
megachromosomal DNA, probes that specifically recognize the
non-centromeric elements, such as probes specific for mouse major satellite DNA,

soil would be of benefit. For example, introduction and expression of genes
that alter root characteristics may enhance water uptake. It also is
contemplated that expression of genes that enhance reproductive fitness
during times of stress would be of significant value. For example, expression
of genes that improve the synchrony of pollen shed and receptiveness of the
female flower parts, i.e., silks, would be of benefit. In addition, it is
proposed that expression of genes that minimize kernel abortion during times
of stress would increase the amount of grain to be harvested and hence be
of value.

Given the overall role of water in determining yield, it is contemplated
that enabling plants to utilize water more efficiently, through the introduction
and expression of genes, will improve overall performance even when soil
water availability is not limiting. By introducing genes that improve the
ability of plants to maximize water usage across a full range of stresses
relating to water availability, yield stability or consistency of yield
performance may be realized.

e. Plant agronomic characteristics 
Plants possessing desired traits that might, for example, enhance
utility, processability and commercial value of the organisms in areas such as
the agricultural and ornamental plant industries may also be generated using
artificial chromosomes in the same manner as described above for production
of disease-resistant organisms. In such instances, the artificial chromosomes
that are introduced into the organism or embryo contain DNA encoding gene
products that serve to confer the desired trait in the organism.

For example, transgenic plants having improved flavor properties,
stability and/or quality are of commercial interest. One possible method for
generating such plants may include the expression of transgenes, e.g., genes
encoding cystathionine gamma synthase (CGS), that result in increased free
methionine levels (see, e.g., PCT Application publication no. WO 00/55303).

Two of the factors determining where crop plants can be grown are
the average daily temperature during the growing season and the length of
time between frosts. Within the areas where it is possible to grow a
particular crop, there are varying limitations on the maximal time it is allowed
to grow to maturity and be harvested. For example, a variety to be grown in
a particular area is selected for its ability to mature and dry down to
harvestable moisture content within the required period of time with
maximum possible yield. Therefore, crops of varying maturities are
developed for different growing locations. Apart from the need to dry down
sufficiently to permit harvest, it is desirable to have maximal drying take
place in the field to minimize the amount of energy required for additional
drying post-harvest. Also, the more readily a product such as grain can dry
down, the more time there is available for growth and kernel fill. Genes that
influence maturity and/or dry down can be identified and introduced into
plant lines using transformation techniques to create new varieties adapted
to different growing locations or the same growing location, but having
improved yield to moisture ratio at harvest. Expression of genes that are
involved in regulation of plant development may be especially useful.

Genes that would improve standability and other plant growth
characteristics may also be introduced into plants. Expression of new genes
in plants which confer stronger stalks, improved root systems, or prevent or
reduce ear droppage would be of great value to the farmer. Introduction and
expression of genes that increase the total amount of photoassimilate
available by, for example, increasing light distribution and/or interception
would be advantageous. In addition, the expression of genes that increase
the efficiency of photosynthesis and/or the leaf canopy would further
increase gains in productivity. Expression of a phytochrome gene in crop
plants may be advantageous. Expression of such a gene may reduce
apical dominance, confer semidwarfism on a plant, and increase shade
tolerance (U.S. Patent No. 5,268,526). Such approaches would allow for
increased plant populations in the field.

f. Nutrient utilization 

The ability to utilize available nutrients may be a limiting factor in
growth of crop plants. It may be possible to alter nutrient uptake, tolerance
of pH extremes, mobilization through the plant, storage pools, and
availability for metabolic activities by the introduction of new agents. These
modifications would allow a plant such as maize to more efficiently utilize
available nutrients. An increase in the activity of, for example, an enzyme
that is normally present in the plant and involved in nutrient utilization may
increase the availability of a nutrient. An example of such an enzyme would
be phytase. It is further contemplated that enhanced nitrogen utilization by a
plant is desirable. Expression of a glutamate dehydrogenase gene in plants,
e.g., E. coli gdhA genes, may lead to enhanced resistance to the herbicide
glufosinate by incorporation of excess ammonia into glutamate, thereby
detoxifying the ammonia. Gene expression may make a nutrient source
available that was previously not accessible, e.g., an enzyme that releases a
component of nutrient value from a more complex molecule, perhaps a
macromolecule. Alternatively, artificial chromosomes can carry the
multiplicity of genes governing nodulation and nitrogen fixation in legumes.
The artificial chromosomes could be used to promote nodulation in
non-legume species.

g. Male sterility 

Male sterility is useful in the production of hybrid seed. Male sterility
may be produced through gene expression. For example, it has been shown
that expression of genes that encode proteins that interfere with
development of the male inflorescence and/or gametophyte results in male
sterility. Chimeric ribonuclease genes that express in the anthers of
transgenic tobacco and oilseed rape have been demonstrated to lead to male
sterility (Mariani et al., 1990). Other methods of conferring male sterility
have been described, including genes encoding antisense RNA capable of
causing male sterility (U.S. Patent Nos. 6,184,439, 6,191,343 and
5,728,926) and methods utilizing two genes to confer sterility (see, e.g.,
U.S. Patent No. 5,426,041).

A number of mutations were discovered in maize that confer
cytoplasmic male sterility. One mutation in particular, referred to as T
cytoplasm, also correlates with sensitivity to Southern corn leaf blight. A
DNA sequence, designated TURF-13 (Levings, 1990), was identified that
correlates with T cytoplasm. It is proposed that it would be possible, through
the introduction of TURF-13 via transformation, to separate male sterility
from disease sensitivity. As it is necessary to be able to restore male fertility
for breeding purposes and for grain production, it is proposed that genes
encoding restoration of male fertility also may be introduced.

h. Improved nutritional content

Genes may be introduced into plants to improve the nutrient quality or
content of a particular crop. Introduction of genes that alter the nutrient
composition of a crop may greatly enhance the feed or food value. For
example, the protein of many grains is suboptimal for feed and food purposes,
especially when fed to pigs, poultry, and humans. The protein is deficient in
several amino acids that are essential in the diet of these species, requiring
the addition of supplements to the grain. Limiting essential amino acids may
include lysine, methionine, tryptophan, threonine, valine, arginine, and
histidine. Some amino acids become limiting only after corn is supplemented
with other inputs for feed formulations. The levels of these essential amino
acids in seeds and grain may be elevated by mechanisms which include, but
are not limited to, the introduction of genes to increase the biosynthesis of
the amino acids, increase the storage of the amino acids in proteins, or
increase transport of the amino acids to the seeds or grain.

The protein composition of a crop may be altered to improve the
balance of amino acids in a variety of ways including elevating expression of
native proteins, decreasing expression of those with poor composition,
changing the composition of native proteins, or introducing genes encoding
entirely new proteins possessing superior composition.

The introduction of genes that alter the oil content of a crop plant may
also be of value. Increases in oil content may result in increases in
metabolizable-energy-content and density of seeds for use in feed and food.
The introduced genes may encode enzymes that remove or reduce rate
limitations or regulated steps in fatty acid or lipid biosynthesis. Such genes
may include, but are not limited to, those that encode acetyl-CoA
carboxylase, ACP-acyltransferase, β-ketoacyl-ACP synthase, plus other well
known fatty acid biosynthetic activities. Other possibilities are genes that
encode proteins that do not possess enzymatic activity such as acyl-carrier
proteins. Genes may be introduced that alter the balance of fatty acids
present in the oil providing a more healthful or nutritive feedstuff. The
introduced DNA also may encode sequences that block expression of
enzymes involved in fatty acid biosynthesis, altering the proportions of fatty
acids present in crops.

Genes may be introduced that enhance the nutritive value of the
starch component of crops, for example by increasing, or in some cases
decreasing, the degree of branching, resulting in improved utilization of the
starch in livestock by delaying its metabolism. Additionally, other major
constituents of a crop may be altered, including by genes that affect a variety
of other nutritive, processing, or other quality aspects. For example,
pigmentation may be increased or decreased.

Feed or food crops may also possess insufficient quantities of
vitamins, requiring supplementation to provide adequate nutritive value.
Introduction of genes that enhance vitamin biosynthesis may be envisioned
including, for example, vitamins A (e.g., rice with Vitamin A or golden rice),
E, B12, choline, and the like. Mineral content may also be sub-optimal. Thus
genes that affect the accumulation or availability of compounds containing
phosphorus, sulfur, calcium, manganese, zinc, and iron among others would
be valuable.

Numerous other examples of improvements of crops may be effected
using the artificial chromosomes, with appropriate heterologous genes
contained therein, in accordance with the methods and compositions
provided herein. The improvements may not necessarily involve grain, but
may, for example, improve the value of a crop for silage. Introduction of
DNA to accomplish this might include sequences that alter lignin production
such as those that result in the "brown midrib" phenotype associated with
superior feed value for cattle.

In addition to direct improvements in feed or food value, genes also
may be introduced which improve the processing of crops and improve the
value of the products resulting from the processing. One use of crops is via
wetmilling. Thus, genes that increase the efficiency and reduce the cost of
such processing, for example, by decreasing steeping time may also find use.
Improving the value of wetmilling products may include altering the quantity
or quality of starch, oil, corn gluten meal, or the components of gluten feed.
Elevation of starch may be achieved through the identification and
elimination of rate limiting steps in starch biosynthesis or by decreasing
levels of the other components of crops resulting in proportional increases in
starch.

Oil is another product of wetmilling, the value of which may be
improved by introduction and expression of genes. Oil properties may be
altered to improve its performance in the production and use of cooking oil,
shortenings, lubricants or other oil-derived products or to improve its
health attributes when used in food-related applications. Fatty acids also
may be synthesized which upon extraction can serve as starting materials for
chemical syntheses. The changes in oil properties may be achieved by
altering the type, level, or lipid arrangement of the fatty acids present in the
oil. This in turn may be accomplished by the addition of genes that encode
enzymes that catalyze the synthesis of new fatty acids and the lipids
possessing them or by increasing levels of native fatty acids while possibly
reducing levels of precursors. Alternatively, DNA sequences may be
introduced which slow or block steps in fatty acid biosynthesis resulting in
the increase in precursor fatty acid intermediates. Genes that might be
added include desaturases, epoxidases, hydratases, dehydratases and other
enzymes that catalyze reactions involving fatty acid intermediates.
Representative examples of catalytic steps that might be blocked include the
desaturations from stearic to oleic acid and oleic to linolenic acid resulting in
the respective accumulations of stearic and oleic acids. Another example is
the blockage of elongation steps resulting in the accumulation of C8 to C12
saturated fatty acids.

i. Production of chemicals or biologicals 
Transgenic plants can be used as protein production systems to
generate recombinant products ranging from industrial enzymes, viral
antigens, vaccines, antibodies, human blood proteins, cytokines, growth
factors, enkephalins, serum albumin and other proteins of clinical relevance
and pharmaceuticals. For example, enzymes including α-amylase, glucanase,
phytase and xylanase can be produced in plants (see, Goddijn and Pen (1995)
Trends Biotechnol. 13:379-387; Pen et al. (1992) Bio/Technology 10:292-296;
Horvath et al. (2000) Proc. Natl. Acad. Sci. U.S.A. 97:1914-1919; and e.g.,
Herbers and Sonnewald (1996) in Transgenic Plants: A Production System for
Industrial and Pharmaceutical Proteins, Owen and Pen Eds., John Wiley & Sons,
West Sussex, England).

Examples of medically relevant proteins that may be produced in
plants include surface antigens of viral pathogens, such as hepatitis B virus
and transmissible gastroenteritis virus spike protein, for use in vaccines. The
proteins thus produced may be isolated and administered through standard
vaccine introduction methods or through the consumption of the edible
transgenic plant as food which can be taken orally (see, e.g., U.S. Patent No.
6,136,320 and Mason et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:11745-11749).
HIV, rhinovirus, malarial and rabies virus antigens are additional
examples of antigens that may be expressed in plants as candidate vaccines
(see, e.g., Porta et al. (1994) Virol. 202:949-955; Turpen et al. (1995)
Bio/Technology 13:53-57; and McGarvey et al. (1995) Bio/Technology
13:1484-1487). Antibodies may also be produced in plants, including, for
example, a gene fusion encoding an antigen-binding single chain Fv protein
(scFv) that recognizes the hapten oxazolone (Fiedler and Conrad (1995)
Bio/Technology 13:1090-1093) and IgG (Ma et al. (1995) Science 268:716-719).

Examples of human biopharmaceuticals that may be expressed in
plants include, but are not limited to, albumin (Sijmons et al. (1990)),
enkephalins (Vandekerckhove et al. (1989)), interferon-α (Zhu et al. (1994))
and GM-CSF (Ganz et al. (1996) in Transgenic Plants: A Production System
for Industrial and Pharmaceutical Proteins, Owen and Pen Eds., John Wiley &
Sons, West Sussex, England, pp. 281-297; and Sardana et al. (1998) in
Methods in Biotechnology, Vol. 3: Recombinant Proteins from Plants:
Production and Isolation of Clinically Useful Compounds, Cunningham and
Porter, Eds., Humana Press, New Jersey; pp. 77-87).

Transgenic plants producing these compounds are made possible by
the introduction and expression of one or potentially many genes using the
artificial chromosomes provided herein. The vast array of possibilities
includes, but is not limited to, any biological compound which is presently
produced by any organism such as proteins, nucleic acids, primary and
intermediary metabolites, carbohydrate polymers, enzymes for uses in
bioremediation, enzymes for modifying pathways that produce secondary
plant metabolites such as flavonoids or vitamins, enzymes that could produce
pharmaceuticals, and enzymes that could produce compounds
of interest to the manufacturing industry such as specialty chemicals and
plastics. The compounds may be produced by the plant, extracted upon
harvest and/or processing, and used for any presently recognized useful
purpose such as pharmaceuticals, fragrances, and industrial enzymes to
name a few. Alternatively, plants produced in accordance with the methods
and compositions provided herein may be made to metabolize certain
compounds, such as hazardous wastes, thereby allowing bioremediation of
these compounds.

j. Non-protein-expressing sequences 
Nucleic acids may be introduced into plants that are designed to
down-regulate or suppress a plant-encoded gene. A number of different means
to achieve down regulation have been demonstrated in the art, including
antisense RNA, ribozymes and co-suppression. The use of antisense RNA to
suppress plant genes is described, for example, in U.S. Patent Nos.
4,801,540, 5,107,065 and 5,453,566. In such methods, an "antisense"
gene is constructed that encodes an RNA that is complementary to the
mRNA of a resident plant gene, such that expression of the antisense gene
inhibits the translation of the mRNA of the resident plant gene. Thus, the
activity of the resident gene is down-regulated.

An additional method of down regulating gene activities involves
ribozymes, or catalytic hammerhead hairpin RNA structures. The use of
ribozymes is described, for example, in U.S. Patent Nos. 4,987,071,
5,037,746, 5,116,742 and 5,354,855. These methods rely on the
expression of small catalytic "hammerhead" RNA molecules that are capable
of binding to and cleaving specific RNA sequences. Ribozymes designed to
specifically recognize a resident plant mRNA can be used to cleave the
mRNA and prevent its proper expression.

Down-regulation of gene activities more or less equivalent to that
obtained with ribozymes and antisense can be achieved by adding additional
copies of the gene to be regulated. The process is referred to as
co-suppression and is described in, for example, U.S. Patent Nos. 5,034,323,
5,283,184 and 5,231,020.

Numerous plant genes may be targeted for down regulation. For
example, a gene may be down-regulated that encodes an enzyme that
catalyzes a reaction in a plant. Reduction of the enzyme activity may reduce
or eliminate products of the reaction which include any enzymatically
synthesized compound in the plant such as fatty acids, amino acids,
carbohydrates, nucleic acids and the like. Alternatively, the protein may be a
storage protein, such as zein, or a structural protein, the decreased
expression of which may lead to changes in seed amino acid composition or
plant morphological changes, respectively. The possibilities cited above are
provided only by way of example and do not represent the full range of
applications.

(1.) Antisense RNA

Genes may be constructed, which when transcribed, produce
antisense RNA that is complementary to all or part(s) of a targeted
messenger RNA(s). The antisense RNA reduces production of the
polypeptide product of the messenger RNA. The polypeptide product may be
any protein encoded by the plant genome. The aforementioned genes will be
referred to as antisense genes. An antisense gene may thus be introduced
into a plant by transformation methods to produce a transgenic plant with
reduced expression of a selected protein of interest. For example, the
protein may be an enzyme that catalyzes a reaction in the plant. Reduction
of the enzyme activity may reduce or eliminate products of the reaction
which include any enzymatically synthesized compound in the plant such as
fatty acids, amino acids, carbohydrates, nucleic acids and the like.
Alternatively, the protein may be a storage protein, such as a zein, or a
structural protein, the decreased expression of which may lead to changes in
seed amino acid composition or plant morphological changes respectively.
The possibilities cited above are provided only by way of example and do not
represent the full range of applications.
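
By way of illustration only, the complementarity between an antisense RNA and its target messenger RNA can be sketched as a reverse-complement computation. The function name and the example sequence below are hypothetical and are not taken from any gene described herein; the sketch shows only the base-pairing relationship and says nothing about how effective a given antisense gene would be in a plant.

```python
# Illustrative sketch: the antisense RNA is the reverse complement of the
# targeted messenger RNA (or of a selected part of it).
# The function name and the example sequence are hypothetical.

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA for an mRNA, both written 5' to 3'."""
    mrna = mrna.upper()
    if set(mrna) - set("ACGU"):
        raise ValueError("expected an RNA sequence containing only A, C, G, U")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return mrna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    target = "AUGGCUUCAAGC"  # made-up fragment of a target mRNA
    print(antisense_rna(target))
```

Applying the function twice returns the original sequence, which is a quick check that the two strands are fully complementary.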

(2.) Ribozymes

Genes also may be constructed or isolated, which when transcribed,
produce RNA enzymes (ribozymes) which can act as endoribonucleases and
catalyze the cleavage of RNA molecules with selected sequences. The
cleavage of selected messenger RNAs can result in the reduced production of
their encoded polypeptide products. These genes may be used to prepare
transgenic plants which possess them. The transgenic plants may possess
reduced levels of polypeptides including, but not limited to, the polypeptides
cited above.

Ribozymes are RNA-protein complexes that cleave nucleic acids in a
site-specific fashion. Ribozymes have specific catalytic domains that
possess endonuclease activity (Kim and Cech, 1987; Gerlach et al., 1987;
Forster and Symons, 1987). For example, a large number of ribozymes
accelerate phosphoester transfer reactions with a high degree of specificity,
often cleaving only one of several phosphoesters in an oligonucleotide
substrate (Cech et al., 1981; Michel and Westhof, 1990; Reinhold-Hurek
and Shub, 1992). This specificity has been attributed to the requirement
that the substrate bind via specific base-pairing interactions to the internal
guide sequence ("IGS") of the ribozyme prior to chemical reaction.

Ribozyme catalysis has primarily been observed as part of
sequence-specific cleavage/ligation reactions involving nucleic acids (Joyce,
1989; Cech et al., 1981). For example, U.S. Patent 5,354,855 reports that certain
ribozymes can act as endonucleases with a sequence specificity greater than
that of known ribonucleases and approaching that of the DNA restriction
enzymes.

Several different ribozyme motifs have been described with RNA
cleavage activity (Symons, 1992). Examples include sequences from the
Group I self splicing introns including Tobacco Ringspot Virus (Prody et al.,
1986), Avocado Sunblotch Viroid (Palukaitis et al., 1979; Symons, 1981)
and Lucerne Transient Streak Virus (Forster and Symons, 1987). Sequences
from these and related viruses are referred to as hammerhead ribozymes
based on a predicted folded secondary structure.

Other suitable ribozymes include sequences from RNase P with RNA
cleavage activity (Yuan et al., 1992; Yuan and Altman, 1994; U.S. Patents
5,168,053 and 5,624,824), hairpin ribozyme structures (Berzal-Herranz et
al., 1992; Chowrira et al., 1993) and Hepatitis Delta virus based ribozymes
(U.S. Patent 5,625,047). The general design and optimization of ribozyme
directed RNA cleavage activity has been discussed in detail (Haselhoff and
Gerlach, 1988; Symons, 1992; Chowrira et al., 1994; Thompson et al.,
1995).

The other variable in ribozyme design is the selection of a cleavage
site on a given target RNA. Ribozymes are targeted to a given sequence by
virtue of annealing to a site by complementary base pair interactions. Two
stretches of homology are required for this targeting. These stretches of
homologous sequences flank the catalytic ribozyme structure defined above.
Each stretch of homologous sequence can vary in length from 7 to 15
nucleotides. The only requirement for defining the homologous sequences is
that, on the target RNA, they are separated by a specific sequence which is
the cleavage site. For hammerhead ribozymes, the cleavage site is a
dinucleotide sequence on the target RNA: a uracil (U) followed by either an
adenine, cytosine or uracil (A, C or U) (Perriman et al., 1992; Thompson et
al., 1995). The frequency of this dinucleotide occurring in any given RNA is
statistically 3 out of 16. Therefore, for a given target messenger RNA of
1,000 bases, 187 dinucleotide cleavage sites are statistically possible.
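
The arithmetic above can be reproduced with a short sketch. It assumes, as the estimate itself does, that the four bases occur with equal frequency and independently of one another; the function names and the example sequence are hypothetical. Three of the sixteen possible dinucleotides (UA, UC and UU) qualify, so a target of 1,000 bases gives 1,000 x 3/16 = 187.5, or about 187 expected sites.

```python
# Illustrative sketch of the cleavage-site estimate for hammerhead ribozymes.
# Assumes equal, independent base frequencies; names are hypothetical.

CLEAVABLE = ("UA", "UC", "UU")  # U followed by A, C or U


def expected_sites(length: int) -> float:
    """Expected number of cleavable dinucleotides in a random RNA."""
    # 3 of the 16 possible dinucleotides qualify.
    return length * len(CLEAVABLE) / 16


def find_sites(rna: str) -> list:
    """Return 0-based positions of each U that starts a cleavable dinucleotide."""
    rna = rna.upper()
    return [i for i in range(len(rna) - 1) if rna[i:i + 2] in CLEAVABLE]


if __name__ == "__main__":
    print(int(expected_sites(1000)))   # estimate for a 1,000-base target
    print(find_sites("GUCAUUGUA"))     # made-up example sequence
```

For an actual messenger RNA the observed count from `find_sites` will differ from the estimate, because real sequences are not random in base composition.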

Designing and testing ribozymes for efficient cleavage of a target RNA
is a process well known to those skilled in the art. Examples of scientific
methods for designing and testing ribozymes are described by Chowrira et al.
(1994) and Lieber and Strauss (1995), each incorporated by reference. The
identification of operative and preferred sequences for use in down regulating
a given gene is simply a matter of preparing and testing a given sequence,
and is a routinely practiced "screening" method known to those of skill in the
art.
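
As a sketch only, candidate sequences for such screening could be enumerated along the following lines. The sketch uses the simplified model described above, in which two stretches of 7 to 15 nucleotides on the target RNA are separated by the cleavage dinucleotide and the ribozyme arms are complementary to those stretches. The function names, the default arm length and the example target are hypothetical, and the catalytic core between the arms is not modeled.

```python
# Illustrative sketch: derive the two targeting arms of a ribozyme from a
# target RNA and a chosen cleavage site, following the simplified model in
# the text (two flanking stretches separated by the cleavage dinucleotide).
# All names and the example sequence are hypothetical.

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence, written 5' to 3'."""
    return rna.upper().translate(_COMPLEMENT)[::-1]


def design_arms(target: str, site: int, arm_len: int = 8) -> dict:
    """Return the arms that anneal on either side of the cleavage site.

    `site` is the 0-based position of the U of the cleavage dinucleotide.
    """
    target = target.upper()
    if not 7 <= arm_len <= 15:
        raise ValueError("arm length must be between 7 and 15 nucleotides")
    if target[site:site + 2] not in ("UA", "UC", "UU"):
        raise ValueError("no U followed by A, C or U at this position")
    if site - arm_len < 0 or site + 2 + arm_len > len(target):
        raise ValueError("cleavage site is too close to an end of the target")
    upstream = target[site - arm_len:site]
    downstream = target[site + 2:site + 2 + arm_len]
    return {
        # The 5' arm of the ribozyme pairs with the downstream stretch of the
        # target, and the 3' arm pairs with the upstream stretch.
        "arm_5prime": reverse_complement(downstream),
        "arm_3prime": reverse_complement(upstream),
    }


if __name__ == "__main__":
    print(design_arms("GGGAAACUCAACCCGG", site=7, arm_len=7))
```

Each candidate produced this way would still have to be prepared and tested, since annealing alone does not guarantee efficient cleavage.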

(3.) Induction of gene silencing 

It also is possible that genes may be introduced to produce transgenic
plants which have reduced expression of a native gene product by the
mechanism of co-suppression. It has been demonstrated in tobacco, tomato,
and petunia (Goring et al., 1991; Smith et al., 1990; Napoli et al., 1990; van
der Krol et al., 1990) that expression of the sense transcript of a native gene
will reduce or eliminate expression of the native gene in a manner similar to
that observed for antisense genes. The introduced gene may encode all or
part of the targeted native protein but its translation may not be required for
reduction of levels of that native protein.

(4.) Non-RNA-expressing sequences

DNA elements including those of transposable elements such as Ds,
Ac, or Mu, may be inserted into a gene to cause mutations. These DNA
elements may be inserted in order to inactivate (or activate) a gene and
thereby "tag" a particular trait. In this instance the transposable element
does not cause instability of the tagged mutation, because the utility of the
element does not depend on its ability to move in the genome. Once a
desired trait is tagged, the introduced DNA sequence may be used to clone
the corresponding gene, e.g., using the introduced DNA sequence as a PCR
primer together with PCR gene cloning techniques (Shapiro, 1983; Dellaporta
et al., 1988). Once identified, the entire gene(s) for the particular trait,
including control or regulatory regions where desired, may be isolated, cloned
and manipulated as desired. The utility of DNA elements introduced into an
organism for purposes of gene tagging is independent of the DNA sequence
and does not depend on any biological activity of the DNA sequence, i.e.,
transcription into RNA or translation into protein. The sole function of the
DNA element is to disrupt the DNA sequence of a gene.

It is contemplated that unexpressed DNA sequences, including
synthetic sequences, could be introduced into cells as proprietary "labels" of
those cells and plants and seeds thereof. It would not be necessary for a
label DNA element to disrupt the function of a gene endogenous to the host
organism, as the sole function of this DNA would be to identify the origin of
the organism. For example, one could introduce a unique DNA sequence into
a plant and this DNA element would identify all cells, plants, and progeny of
these cells as having arisen from that labeled source. It is proposed that
inclusion of label DNAs would enable one to distinguish proprietary
germplasm or germplasm derived from such, from unlabelled germplasm.

Another possible element which may be introduced is a matrix
attachment region element (MAR), such as the chicken lysozyme A element
(Stief, 1989), which can be positioned around an expressible gene of interest
to effect an increase in overall expression of the gene and diminish position
dependent effects upon incorporation into the plant genome (Stief et al.,
1989; Phi-Van et al., 1990). Sequences such as MARs can be included on
the artificial chromosome to enhance gene expression.

3. Transgenic models for evaluation of genes and discovery of 
new traits 

Of significant interest is the use of plants and plant cells containing
artificial chromosomes for the evaluation of new genetic combinations and
discovery of new traits. Artificial chromosomes, because they can contain
significant amounts of DNA, can encode numerous genes and accordingly a
multiplicity of traits. It is contemplated
here that artificial chromosomes, when formed from one plant species, can
be evaluated in a second plant species. The resultant phenotypic changes
observed, for example, can indicate the nature of the genes contained within
the DNA contained in the artificial chromosome, and hence permit the
identification of new genetic activities. Artificial chromosomes containing
euchromatic DNA or partially containing euchromatic DNA can serve as a



WO 2002/096923 



PCT/US2002/017451 



-160- 

valuable source of new traits when transferred to an alien plant cell
environment. For example, it is contemplated that artificial chromosomes
derived from dicot plant species can be introduced into monocot plant
species by transferring a dicot artificial chromosome, the dicot artificial
chromosome containing a region of euchromatic DNA containing expressed
genes.

The artificial chromosomes can be generated or manipulated in such a
fashion that a large region of naturally occurring plant DNA becomes
incorporated into the artificial chromosome. This allows the artificial
chromosome to contain new genetic activities and hence carry new traits.
For example, an artificial chromosome can be introduced into a wild relative
of a crop plant under conditions whereby a portion of the DNA present in the
chromosomes of the wild relative is transferred to the artificial chromosome.
After isolation of the artificial chromosome, this naturally occurring region of
DNA from the wild relative, now located on the artificial chromosome, can be
introduced into the domesticated crop species and the genes encoded within
the transferred DNA expressed and evaluated for utility. New traits and gene
systems can be discovered in this fashion.

Artificial chromosomes modified to recombine with plant DNA offer
many advantages for the discovery and evaluation of traits in different plant
species. When the artificial chromosome containing DNA from one plant
species is introduced into a new plant species, new traits and genes can be
introduced. This use of an artificial chromosome makes it possible to
overcome the sexual barrier that prevents transfer of genes from one plant
species to another species. Using artificial chromosomes in this fashion
allows for many potentially valuable traits to be identified including traits that
are typically found in wild species. Other valuable applications for artificial
chromosomes include the ability to transfer large regions of DNA from one
plant species to another, DNA encoding potentially valuable traits such as
altered oil, carbohydrate or protein composition, multiple genes encoding
enzymes capable of producing valuable plant secondary metabolites, genetic
systems encoding valuable agronomic traits such as disease and insect
resistance, genes encoding functions that allow association with soil
bacteria such as growth promoting bacteria or nitrogen fixing bacteria, or
genes encoding traits that confer freezing, drought or other stress tolerances.
In this fashion, artificial chromosomes can be used to discover regions of
plant DNA that encode valuable traits.

The artificial chromosome can also be designed to allow the transfer
and subsequent incorporation of these valuable traits now located on the
artificial chromosome into the natural chromosomes of a plant species. In
this fashion the artificial chromosomes can be used to transfer large regions
of DNA encoding traits normally found in one plant species into another plant
species. In this fashion, it is possible to derive a plant cell that no longer
needs to carry an artificial chromosome to possess the new trait. Thus the
artificial chromosome would serve as the transfer mechanism to permit the
formation of plants with a greater degree of genetic diversity.

An artificial chromosome can be designed in a variety of ways to
accomplish the afore-mentioned purposes. An artificial chromosome can be
modified to contain sequences that promote homologous recombination
within plant cells, or be modified to contain a genetic system that functions
as a site-specific recombination system. For example, the DNA sequence of
Arabidopsis is now known. To construct an artificial chromosome capable of
recombining with a specific region of Arabidopsis DNA, a sequence of
Arabidopsis DNA, normally located near a chromosomal location encoding
genes of potential interest, can be introduced into an artificial chromosome by
methods provided herein. It may be desirable to include a second region of
DNA within the artificial chromosome that provides a second flanking
sequence to the region encoding genes of potential interest, to promote a
double recombination event which would ensure transfer of the entire
chromosomal region encoding genes of potential interest to the artificial
chromosome. The modified artificial chromosome, containing the DNA
sequences capable of homologous recombination, can then be
introduced into Arabidopsis cells and the homologous recombination event is
selected.

It is convenient to include a marker gene to allow for the selection of a
homologous recombination event. The marker gene is preferably inactive
unless activated by an appropriate homologous recombination event. For
example, US 5,272,071 describes a method where an inactive plant gene is
activated by a recombination event such that desired homologous
recombination events can be easily scored. Similarly, US 5,501,967
describes a method for the selection of homologous recombination events by
activation of a silent selection gene first introduced into the plant DNA, the
gene being activated by an appropriate homologous recombination event.
Both of these methods can be applied to enable a selective process to be
included to select for recombination between an artificial chromosome and
a plant chromosome. Once the homologous recombination event is
detected and selected, the artificial chromosome is isolated and introduced
into a recipient cell, for example, tobacco, corn, wheat or rice, and the
expression of the newly introduced DNA sequences evaluated. Selection of
recombinant events can take place in cell culture, or following seed formation
and screening of seedling plants or seed itself.

Phenotypic changes in the recipient plant cells containing the artificial
chromosome, or in regenerated plants containing the artificial chromosome,
allow for the evaluation of the nature of the traits encoded by the genes of
interest, for example, Arabidopsis DNA, under conditions naturally found in
plant cells, including the naturally occurring arrangement of DNA sequences
responsible for the developmental control of the traits in the normal
chromosomal environment.

Traits such as durable fungal or bacterial disease resistance, new oil and
carbohydrate compositions, valuable secondary metabolites such as
phytosterols, flavonoids, efficient nitrogen fixation or mineral utilization,
resistance to extremes of drought, heat or cold are all found within different
populations of plant species and are often governed by multiple genes. The use
of single gene transformation technologies does not permit the evaluation of the
multiplicity of genes controlling many valuable traits. Thus, incorporation of
these genes into artificial chromosomes allows the rapid evaluation of the utility
of these genetic combinations in heterologous plant species.

The large scale order and structure of the artificial chromosome provides
a number of unique advantages in screening for new utilities or new phenotypes
within heterologous plant species. The size of new DNA that can be carried by
an artificial chromosome can be millions of base pairs of DNA, representing
potentially numerous genes that may have different or new utility in a
heterologous plant cell. The artificial chromosome is a "natural" environment
for gene expression; the problems of variable gene expression and silencing
seen for genes transferred by random insertion into a genome should not be
observed. Similarly, there is no need to engineer the genes for expression, and
the genes inserted would not need to be recombinant genes. Thus, transferred
genes are fully expected to be expressed in the typical temporal and spatial
fashion as observed in the species from where the genes were initially isolated.
A valuable feature for these utilities is the ability to isolate the artificial
chromosomes and to further isolate, manipulate and introduce into other cells
artificial chromosomes carrying unique genetic compositions.

Thus, the use of artificial chromosomes and homologous recombination
in plant cells can be used to isolate and identify many valuable crop traits. In
addition to the use of artificial chromosomes for the isolation and testing of
large regions of naturally occurring DNA, methods for the use of artificial
chromosomes and cloned DNA are also contemplated. Similar to that described
above, artificial chromosomes can be used to carry large regions of cloned DNA,
including that derived from other plant species.

The ability to incorporate DNA elements into artificial chromosomes as
they are being formed allows for the development of artificial chromosomes
specifically engineered as a platform for testing of new genetic combinations,
or "genomic" discoveries for model species such as Arabidopsis. Specific
"recombinase" systems can be used in plant cells to excise or re-arrange genes;
these same systems can be used to derive new gene combinations contained
on an artificial chromosome. In this regard, it is contemplated that the use of
site specific recombination sequences can have considerable utility in
developing artificial chromosomes containing DNA sequences recognized by
recombinase enzymes and capable of accepting DNA sequences containing
same. The use of site-specific recombination as a means to target an
introduced DNA to a specific locus has been demonstrated in the art and such
methods can be employed. The recombinase systems can also be used to
transfer the cloned DNA regions contained within the artificial chromosome to
the naturally occurring plant chromosomes.

Many site specific recombinases have been described in the literature
(Kilby et al., Trends in Genetics, 9(12): 413-418, 1993). Among these are:
an activity identified as R encoded by the pSR1 plasmid of Zygosaccharomyces
rouxii, FLP encoded by the 2 µm circular plasmid from Saccharomyces
cerevisiae and Cre-lox from the phage P1.

The integration function of site specific recombinases is contemplated as
a means to assist in the derivation of genetic combinations on artificial
chromosomes. In order to accomplish this, it is contemplated that a first step
of introducing site-specific recombinase sites into the genome of a plant cell in
an essentially random manner is conducted, such that the plant cell has one or
more site-specific recombinase recognition sequences on one or more of the
plant chromosomes. An artificial chromosome is then introduced into the plant
cell, the artificial chromosome engineered to contain a recombinase recognition
site capable of being recognized by a site specific recombinase. Optionally a
gene encoding a recombinase enzyme is also included, preferably under the
control of an inducible promoter. Expression of the site specific recombinase
enzyme in the plant cell, either by induction of an inducible recombinase gene,
or transient expression of a recombinase sequence, causes a site-specific
recombination event to take place, leading to the insertion of a region of the
plant chromosomal DNA containing the recombinase recognition site into the
recombinase recognition site of the artificial chromosome, forming an artificial
chromosome containing plant chromosomal DNA. The artificial chromosome
can be isolated and introduced into a heterologous host, preferably a plant host,
and expression of the newly introduced plant chromosomal DNA can be
monitored and evaluated for desirable phenotypic changes. Accordingly,
carrying out this recombination with a population of plant cells wherein the
chromosomally located recombinase recognition site is randomly scattered
throughout the chromosomes of the plant can lead to the formation of a
population of artificial chromosomes, each with a different region of plant
chromosomal DNA, each representing a new genetic combination.

This particular method involves the precise site-specific insertion of
chromosomal DNA into the artificial chromosome. This precision has been
demonstrated in the art. For example, Fukushige and Sauer (Proc. Natl. Acad.
Sci. USA, 89:7905-7909, 1992) demonstrated that the Cre-lox homologous
recombination system could be successfully employed to introduce DNA into a
predefined locus in a chromosome of mammalian cells. In this demonstration
a promoter-less antibiotic resistance gene modified to include a lox sequence at
the 5' end of the coding region was introduced into CHO cells. Cells were
re-transformed by electroporation with a plasmid that contained a promoter with
a lox sequence and a transiently expressed Cre recombinase gene. Under the
conditions employed, the expression of the Cre enzyme catalyzed the
homologous recombination between the lox site in the chromosomally located
promoter-less antibiotic resistance gene and the lox site in the introduced
promoter sequence leading to the formation of a functional antibiotic resistance
gene. The authors demonstrated efficient and correct targeting of the
introduced sequence: 54 of 56 lines analyzed corresponded to the predicted
single copy insertion of the DNA due to Cre catalyzed site specific homologous
recombination between the lox sequences.

The use of the same Cre-lox system has been demonstrated in plants
(Dale and Ow, Gene 91:79-85, 1995) to specifically excise, delete or insert
DNA. The precise event is controlled by the orientation of lox DNA sequences:
in cis, the lox sequences direct the Cre recombinase to either delete (lox
sequences in direct orientation) or invert (lox sequences in inverted orientation)
DNA flanked by the sequences, while in trans the lox sequences can direct a
homologous recombination event resulting in the insertion of a recombinant
DNA. Accordingly, a lox sequence may be first added to a genome of a plant
species capable of being transformed and regenerated to a whole plant to serve
as a recombinase target DNA sequence for recombination with an artificial
chromosome. The lox sequence may be optimally modified to further contain
a selectable marker which is inactive but can be activated by insertion of the lox
recombinase recognition sequence into the artificial chromosome.
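
By way of illustration only, the orientation rule described above (deletion for directly repeated lox sites in cis, inversion for inverted lox sites in cis, insertion for lox sites in trans) can be written as a small decision function. This is a simplified sketch and not part of the disclosed methods; the names `Outcome` and `cre_lox_outcome` are hypothetical, and the model ignores reaction reversibility and efficiency.

```python
from enum import Enum


class Outcome(Enum):
    DELETION = "deletion"
    INVERSION = "inversion"
    INSERTION = "insertion"


def cre_lox_outcome(same_molecule: bool, same_orientation: bool = True) -> Outcome:
    """Predict the outcome of Cre recombination between two lox sites.

    same_molecule    -- True if both lox sites lie on one DNA molecule (cis),
                        False if they lie on separate molecules (trans).
    same_orientation -- for cis sites, True if the sites are direct repeats,
                        False if they are inverted relative to each other.
    """
    if not same_molecule:
        # Sites in trans: recombination joins the two molecules (insertion).
        return Outcome.INSERTION
    # Sites in cis: direct repeats delete, inverted repeats invert.
    return Outcome.DELETION if same_orientation else Outcome.INVERSION


if __name__ == "__main__":
    print(cre_lox_outcome(same_molecule=True, same_orientation=True).value)
    print(cre_lox_outcome(same_molecule=True, same_orientation=False).value)
    print(cre_lox_outcome(same_molecule=False).value)
```

In the scheme described here, the lox site in the plant chromosome and the lox site in the artificial chromosome are on separate molecules, which corresponds to the trans (insertion) case.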

A promoterless marker gene or selectable marker gene linked to the
recombinase recognition sequence, which is first inserted into the chromosomes
of a plant cell, can be used to engineer a platform chromosome. A promoter is
linked to a recombinase recognition site, in an orientation that allows the
promoter to control the expression of the marker or selectable marker gene
upon recombination within the artificial chromosome. Upon a site-specific
recombination event between a recombinase recognition site in a plant
chromosome and the recombinase recognition site within the introduced
artificial chromosome, a cell is derived with a recombined artificial chromosome,
the artificial chromosome containing an active marker or selectable marker
activity that permits the identification and/or selection of the cell.

The artificial chromosomes can be transferred to other plant species and
the functionality of the new combinations tested. The ability to conduct such
an inter-chromosomal transfer of sequences has been demonstrated in the art.
For example, the use of the Cre-lox recombinase system to cause a
chromosome recombination event between two chromatids of different
chromosomes has been shown.

Any number of recombination systems may be employed (see, U.S.
provisional application Serial No. filed the same day herewith under attorney
docket no. 24601-P420). Such systems include, but are not limited to,
bacterially derived systems such as the Int/att system of phage lambda and the
Gin/gix system.

More than one recombination system may be employed, including, for
example, one recombinase system for the introduction of DNA into an artificial
chromosome, and a second recombinase system for the subsequent transfer of
the newly introduced DNA contained within an artificial chromosome into the
naturally occurring chromosome of a second plant species. The choice of the
specific recombination system used will be dependent on the nature of the
modification contemplated.

By having the ability to isolate an artificial chromosome, and in particular
artificial chromosomes containing plant chromosomal DNA introduced via
site-specific recombination, and re-introduce the chromosome into other cells,
particularly plant cells, these new combinations can be evaluated in different
crop species without the need to first isolate and modify the genes, or carry out
multiple transformations or gene transfers to achieve the same combination
isolation and testing combinations of the genes in plants. The use of a
site-specific recombinase and artificial chromosomes also allows the convenient
recovery of the plant chromosomal region into other recombinant DNA vectors
and systems for manipulation and study.

The artificial chromosomes can be engineered as platforms to accept
large regions of cloned DNA, such as that contained in Bacterial Artificial
Chromosomes (BACs) or Yeast Artificial Chromosomes (YACs). It is further
contemplated that, as a result of the typical structure of amplification-based
artificial chromosomes, such as, for example, SATACs (or ACes), containing
tandemly repeated DNA blocks, more than one cloned DNA sequence can be
introduced by recombination processes. In particular, recombination within a
predefined region of the tandemly repeated DNA within the artificial
chromosome provides a mechanism to "stack" numerous regions of cloned
DNA, including large regions of DNA contained within BAC or YAC clones.
Thus, multiple combinations of genes can be introduced onto artificial
chromosomes and these combinations tested for functionality. In particular, it
is contemplated that multiple YACs or BACs can be stacked onto an artificial
chromosome, the BACs or YACs containing multiple genes of complex
pathways or multiple genetic pathways. The BACs or YACs are typically
selected based on genetic information available within the public domain, for
example from the Arabidopsis Information Management System
(http://aims.cps.msu.edu/aims/index.html) or the information related to the plant
DNA sequences available from the Institute for Genomic Research
(http://www.tigr.org) and other sites known to those skilled in the art.
Alternatively, clones can be chosen at random and evaluated for functionality.
It is contemplated that combinations providing a desired phenotype can be
identified by isolation of the artificial chromosome containing the combination
and analyzing the nature of the inserted cloned DNA.

In another embodiment of the methods provided herein for discovering
genes associated with plant traits, the artificial chromosome used to transfer
plant DNA to a host cell for evaluation therein will contain large regions of plant
DNA, in particular plant euchromatin, as a result of the process by which the
artificial chromosome is produced. In particular, the artificial chromosome may
be an amplification-based artificial chromosome, including, but not limited to:
(1) a minichromosome arising from breakage of a dicentric chromosome, (2) an
artificial chromosome containing one or more regions of repeating nucleic acid
units wherein the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid, (3) an artificial chromosome
containing one or more regions of repeating nucleic acid units wherein the
repeat region(s) is made up predominantly of euchromatic DNA or contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
90% euchromatic DNA, (4) an artificial chromosome containing one or more
regions of repeating nucleic acid units wherein the artificial chromosome is
made up of substantially equivalent amounts of heterochromatin and
euchromatin, (5) an artificial chromosome containing one or more regions
of repeating nucleic acid units having common nucleic acid sequences that
represent euchromatic and heterochromatic nucleic acid and (6) a sausage-like
structure that contains a portion or all of a euchromatin-containing arm of a
plant chromosome.

In these methods for discovering genes associated with plant traits,
because the artificial chromosome used to transfer plant DNA to a host cell for
evaluation therein is generated to already contain large amounts of plant DNA,
in particular plant euchromatin, there is no need to introduce plant euchromatin
into the artificial chromosomes by homologous or site-specific recombination.

4. Use of artificial chromosomes for preparation and screening of
libraries

Since large fragments of DNA can be incorporated into artificial
chromosomes (ACs), they are well-suited for use as cloning vehicles that can
accommodate entire genomes in the preparation of genomic DNA libraries,
which then can be readily screened for functionality as described above or for
specific gene sequences for further modification and study. For example, it is
possible to use artificial chromosomes to prepare artificial chromosome libraries
containing plant genomic DNA, useful in the identification and isolation
of functional DNA components such as genes, centromeric DNA and telomeric
DNA from a variety of different species of plants.

The following examples are included for illustrative purposes only and are 
not intended to limit the scope of the invention. 

Example 1

Generation of Arabidopsis protoplasts

Plant protoplasts are typically generated from plant cells following
standard techniques (for example, Maheshwari et al., Crit. Rev. Plant Sci.
14:149-178, 1995; Ramulu et al., Methods in Molecular Biology 111:227-242,
1999). Typically plant protoplasts are prepared from fresh plant tissue, e.g.,
leaf, or can be prepared by converting cell suspension cultures to protoplasts
by removal of the cell walls enzymatically. For production of Arabidopsis
protoplasts, the methods of Karesh et al. (Plant Cell Reports 9:575-578, 1991)
and Mathur et al. (Plant Cell Reports 14:221-226, 1995) were used to generate
Arabidopsis suspension cultures by modifications thereof as described below.
These cells were maintained in liquid culture and subcultured as required,
usually between 7 and 10 days in culture.

Establishment of suspension cultures 

Cell suspension cultures derived from root callus of Arabidopsis thaliana
cv. Columbia, RLD and Landsberg erecta were used. Calli were induced from
roots of 3 week-old seedlings on callus induction medium containing MS basic
media (Murashige and Skoog (1962) Physiol. Plant 15:473-497) with 3%
sucrose, 0.5 mg/l naphthalene acetic acid (NAA), 0.05 mg/l kinetin (Sigma-
Aldrich Canada). The cell suspension cultures were grown from the calli in
liquid callus induction medium at 22°C with shaking at 120 rpm. They were
subcultured every 7 days.

Generation of protoplasts 

One gram of 4-5 day-old suspension culture was placed in 6 ml
enzyme solution containing 1% Cellulase 'Onozuka' R-10 and 0.25%
Macerozyme R-10 in 35 g/l CaCl2·2H2O (Hartmann et al. (1998) Plant Mol. Biol.
36:741-754) and incubated at 22°C in the dark with shaking at 70 rpm for 15
h. The protoplast mixture was poured through a 100 µm nylon mesh sieve and
centrifuged at 250xg for 5 min. The protoplasts were washed with 35 g/l
CaCl2·2H2O and resuspended in 10 ml floating medium containing B5 medium
(Gamborg et al. (1968) Exp. Cell Res. 50:151-158) with 144 g/l sucrose and 1
mg/l 2,4-dichlorophenoxyacetic acid (2,4-D). The protoplasts were centrifuged
at 80xg for 10 min, collected at the interface and used immediately for
transfection.

Example 2 

Generation of Tobacco Mesophyll Protoplasts 

Mesophyll protoplasts were generated from leaves of sterile plantlets of N.
tabacum cv. Xanthi. The plantlets were grown aseptically on MSO medium (MS
basal media, 3% sucrose, 0.05% morpholinoethanesulfonic acid (MES), 1.0
mg/l benzyl adenine (BA), 0.1 mg/l NAA and 0.8% agar, pH 5.8) at 22°C under
a 16/8 h photoperiod (see also Bilang et al. (1994) Plant Molecular Biology
Manual A1:1-6). Fully expanded leaves (2x4 cm) were cut in half, the main
vein removed and the upper epidermis scored with parallel cuts. Leaf pieces
were immersed in 6 ml enzyme solution containing 1.2% Cellulase 'Onozuka'
R-10 and 0.4% Macerozyme R-10 in K4 medium (Nagy and Maliga (1976) Z.
Pflanzenphysiol. 75:453-455) and incubated at 22°C for 15 h without shaking.
The protoplasts were purified by pouring through a 100 µm nylon mesh sieve.
The suspension of protoplasts was carefully overlaid with 1 ml W5 solution
(Bilang et al. (1994) Plant Molecular Biology Manual A1:1-6) and centrifuged at
80xg for 10 min. Protoplasts were then resuspended in W5 solution at a density
of 1 x 10^6 protoplasts/ml and stored at 4°C for 1 to 2 hours prior to treatment,
for example, DNA uptake or chromosome transfer.
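
As a worked illustration of the resuspension step, the volume of W5 solution needed to reach the stated density of 1 x 10^6 protoplasts/ml follows directly from the total number of protoplasts recovered. The sketch below is an assumption-laden convenience calculation, not part of the protocol: the function name `resuspension_volume_ml` is hypothetical, and the total count is assumed to have been obtained separately (for example, by hemocytometer).

```python
def resuspension_volume_ml(total_protoplasts: float,
                           target_per_ml: float = 1e6) -> float:
    """Return the volume (ml) that gives the target protoplast density.

    total_protoplasts -- total number of protoplasts in the pellet or band
    target_per_ml     -- desired density; defaults to the 1 x 10^6 per ml
                         used in this example
    """
    if total_protoplasts < 0:
        raise ValueError("total_protoplasts must be non-negative")
    if target_per_ml <= 0:
        raise ValueError("target_per_ml must be positive")
    return total_protoplasts / target_per_ml


if __name__ == "__main__":
    # Hypothetical yield of 2.5 million protoplasts.
    print(resuspension_volume_ml(2.5e6), "ml of W5 solution")
```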

Example 3 

Production of Tobacco Protoplasts from Suspension Cultures 

Tobacco BY-2 protoplasts are prepared from suspension cultures according
to the method of Nagata et al. [(1981) Molecular and General Genetics,
184:161-165].

Example 4 

Generation of Brassica Hypocotyl Protoplasts 

Genotypes of Brassica napus, B. oleracea, B. juncea and B. carinata may
be used to generate protoplasts. Seeds of Brassica napus were
surface-sterilized (for 2 min with 70% ethanol, then for 20 min with 2.4%
sodium hypochlorite containing one drop of Tween 20 per 100 ml). Seeds were
rinsed thoroughly with sterile distilled water and grown aseptically on
autoclaved germination medium (half-strength basal Murashige and Skoog's
medium (MS), 1% sucrose, 0.8% agar, pH 5.8). Unless otherwise indicated,
the protoplast generation procedures were performed aseptically and solutions
and media were filter-sterilized. Alternatively, protoplasts can be generated and
cultured successfully from different explants using various protocol
modifications (for example, Kao et al. (1991) Plant Science 75:63-72; Kao et
al. (1990) Plant Cell Rep. 5:311-315; Kao and Seguin-Swartz (1987) Plant Cell
Tiss. Org. Cult. 10:79-90; Kao (1977) Mol. Gen. Genet. 150:225-230).

Generation of Hypocotyl Protoplasts

Hypocotyls were excised from 4 or 5 day-old seedlings grown aseptically
in the dark with or without light exposure for a few hours prior to use. The
explants were cut transversely into 2-5 mm pieces and incubated in enzyme
solution (salts, vitamins and organic acids of Kao's medium (Kao (1977) Mol.
Gen. Genet. 150:225-230), 0.4 g/l CaCl2·2H2O, 13% sucrose, 1%
Cellulase 'Onozuka R10', 0.1% Pectolyase Y23, pH 5.6) in petri dishes, in
darkness, without agitation for 14-18 hours, then with agitation on a rotary
shaker (ca. 50 rpm) for 15-30 min.

The mixture was filtered through a 63 µm nylon screen into centrifuge
tubes, and an equal volume of 17.5% sucrose was added to each tube.
Following centrifugation (ca. 100xg, 8 min), the protoplast band that formed at
the top of each tube was collected. Protoplasts were washed 3 times by
resuspension in wash solution [solution W5 of Menczel and Wolfe (1984, Plant
Cell Rep 3:196-198) at a reduced strength (0.8X)] followed by centrifugation
at 100xg for 3-5 min and discarding the supernatant.

Protoplasts were cultured in Kao's medium containing the salts, vitamins
and organic acids with 30 g/l sucrose, 68.4 g/l glucose, 0.5 mg/l NAA, 0.5 mg/l
BA, 0.5 mg/l 2,4-D, pH 5.7, at a density of 1 x 10^5 per ml and incubated at
25°C, 16 h photoperiod, in dim fluorescent light (25 µE m^-2 s^-1).
After 5-8 days in culture, 1-1.5 ml of feeder medium (the above
medium, except with 55.8 g/l glucose instead of 68.4 g/l) was added to each
dish, and the dishes were placed under brighter fluorescent light (50 µE m^-2 s^-1).
At about 14 days, 1-2 ml of medium were removed from each dish, and 2-3 ml
of feeder medium containing basal B5 medium (Gamborg et al. (1968) Exp. Cell
Res. 50:151-158), 3% sucrose, 3.8% glucose, 0.5 mg/l BA, 0.5 mg/l NAA, and
0.5 mg/l 2,4-D, pH 5.7, were added. At about 21 days, if microcolonies have
not yet formed, the cultures can be fed with the last feeder medium except with
2.2% glucose instead of 3.8%. Protoplast cultures can be washed when
necessary by adding new feeder medium, gently swirling petri dishes, allowing
cells to settle, removing most of the supernatant and adding fresh medium to
the dishes.

At 3-5 weeks, microcolonies were embedded with medium containing a 1:1
mixture of the last feeder medium and proliferation medium which contains the
components of the feeder medium with 0.9% glucose and 1.6% agarose to
make a concentration of 0.8% in the final mixture. Cultures were incubated as
described above in bright fluorescent light (80-100 µE m^-2 s^-1). After 10 days
to 2 weeks, green colonies were plated onto the regeneration medium.
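
The stated final agarose concentration can be checked with a simple volume-weighted average: mixing equal volumes of an agarose-free feeder medium and a proliferation medium containing 1.6% agarose gives 0.8%. The following is a minimal sketch of that arithmetic only; the function name `mixed_concentration` is hypothetical, and the feeder medium is assumed to contain no agarose.

```python
def mixed_concentration(c1: float, v1: float, c2: float, v2: float) -> float:
    """Volume-weighted concentration after combining two solutions.

    c1, c2 -- concentrations of the two solutions (same units, e.g. % w/v)
    v1, v2 -- volumes of the two solutions (same units, e.g. ml)
    """
    total_volume = v1 + v2
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return (c1 * v1 + c2 * v2) / total_volume


if __name__ == "__main__":
    # 1:1 mixture: feeder medium (assumed 0% agarose) + 1.6% agarose medium.
    final_agarose = mixed_concentration(0.0, 1.0, 1.6, 1.0)
    print(f"final agarose: {final_agarose:.1f}%")
```

The same function applies to any other component that differs between the two media, provided its concentration in each is known.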

Example 5

Preparation of a Transformation Vector Useful for the Induction of
Plant Artificial Chromosome Formation

Plant artificial chromosomes (PACs) can be generated by introducing
nucleic acid, such as DNA, which can include an amplification-inducing DNA
and/or a targeting DNA, for example rDNA or lambda DNA, into a plant cell,
allowing the cell to grow, and then identifying from among the resulting cells
those that include a chromosome with a structure that is distinct from that of
any chromosome that existed in the cell prior to introduction of the nucleic acid.
The structure of a PAC reflects amplification of chromosomal DNA, for example,
segmented, repeat region-containing and heterochromatic structures. It is also
possible to select cells that contain structures that are precursors to PACs, for
example, chromosomes containing more than one centromere and/or fragments
thereof, and culture and/or manipulate them to ultimately generate a PAC within
the cell.

In the method of generating PACs, the nucleic acid can be introduced
into a variety of plant cells. The nucleic acid can include targeting DNA and/or
a plant-expressible DNA encoding one or multiple selectable markers (e.g., DNA
encoding bialaphos (bar) resistance) or scorable markers (e.g., DNA encoding
GFP). Examples of targeting DNA include, but are not limited to, N. tabacum
rDNA intergenic spacer sequence (IGS) and Arabidopsis rDNA such as the 18S,
5.8S, 26S rDNA and/or the intergenic spacer sequence. The DNA can be
introduced using a variety of methods, including, but not limited to,
Agrobacterium-mediated methods, PEG-mediated DNA uptake and
electroporation using, for example, standard procedures according to Hartmann
et al. [(1998) Plant Molecular Biology 35:741]. The cell into which such DNA
is introduced can be grown under selective conditions or can initially be grown
under non-selective conditions and then transferred to selective media. The
cells or protoplasts can be placed on plates containing a selection agent to
grow, for example, individual calli. Resistant calli can be scored for scorable
marker expression. Metaphase spreads of resistant cultures can be prepared,
and the metaphase chromosomes examined by FISH analysis using specific
probes in order to detect amplification of regions of the chromosomes. Cells
that have artificial chromosomes with functioning centromeres or artificial
chromosomal intermediate structures, including, but not limited to, dicentric
chromosomes, formerly dicentric chromosomes, minichromosomes,
heterochromatin structures (e.g., sausage chromosomes), and stable
self-replicating artificial chromosomal intermediates as described herein, are
identified and cultured. In particular, the cells containing self-replicating artificial
chromosomes are identified.

The DNA introduced into a plant cell for the generation of PACs can be
in any form, including in the form of a vector. An exemplary vector for use in
methods of generating PACs can be prepared as follows.

For the production of artificial chromosomes, plant transformation
vectors, as exemplified by pAglla and pAgllb, containing a selectable marker,
a targeting sequence, and a scorable marker were constructed using procedures
well known in the art to combine the various fragments. The vectors can be
prepared using vector pAg1 as a base vector and inserting the following DNA
fragments into pAg1: DNA encoding β-glucuronidase under the control of the
nopaline synthase (NOS) promoter fragment and flanked at the 3' end by the
NOS terminator fragment, a fragment of mouse satellite DNA and an N.
tabacum rDNA intergenic spacer sequence (IGS). In constructing plant
transformation vectors, vector pAg2 can also be used as the base vector.

1. Construction of pAG1

Vector pAg1 (SEQ. ID. NO: 1; see Figure 1) is a derivative of the
CAMBIA vector named pCambia 3300 (Center for the Application of Molecular
Biology to International Agriculture, i.e., CAMBIA, Canberra, Australia;
www.cambia.org), which is a modified version of vector pCambia 1300 to
which has been added DNA from the bar gene conferring resistance to
phosphinothricin. The nucleotide sequence of pCambia 3300 is provided in
SEQ. ID. NO: 2. pCambia 3300 also contains a lacZ alpha sequence containing
a polylinker region.

pAg1 was constructed by inserting two new functional DNA fragments
into the polylinker of pCambia 3300: one sequence containing an attB site and
a promoterless zeomycin resistance-encoding DNA flanked at the 3' end by a
SV40 polyA signal sequence, and a second sequence containing DNA from the
hygromycin resistance gene (hygromycin phosphotransferase) conferring
resistance to hygromycin for selection in plants. Although the zeomycin-SV40
polyA signal fusion is not expected to provide the basis for zeomycin selection
in plant cells, it can be activated in mammalian cells by insertion of a functional
promoter element into the attB site by site-specific recombination catalyzed by
the Lambda att integrase. Thus, the inclusion of the attB-zeomycin sequences
allows for evaluation of functionality of plant artificial chromosomes in
mammalian cells by activation of the zeomycin resistance-encoding DNA, and
provides an att site for further insertion of new DNA sequences into plant
artificial chromosomes formed as a result of using pAg1 for plant
transformation. The second functional DNA fragment allows for selection of
plant cells with hygromycin. Thus, pAg1 contains DNA from the bar gene
conferring resistance to phosphinothricin, DNA from the hygromycin resistance
gene, both resistance-encoding DNAs under the control of a separate
cauliflower mosaic virus (CaMV) 35S promoter, and the attB-promoterless
zeomycin resistance-encoding DNA.

pAg1 is a binary vector containing Agrobacterium right and left T-DNA
border sequences for use in Agrobacterium-mediated transformation of plant
cells or protoplasts with the DNA located between the border sequences. pAg1
also contains the pBR322 Ori for replication in E. coli. pAg1 was constructed
by ligating HindIII/PstI-digested p3300attBZeo with HindIII/PstI-digested
pBSCaMV35SHyg as follows (see Figure 2).

a. Generation of p3300attBZeo

Plasmid pCambia 3300 was digested with PstI/Ecl136II and ligated with
PstI/StuI-digested pLITattBZeo (the nucleotide sequence of pLITattBZeo is
provided in SEQ. ID. NO: 19) to generate p3300attBZeo, which contains an attB
site, a promoterless zeomycin resistance-encoding DNA flanked at the 3' end
by a SV40 polyA signal, and a reconstructed PstI site.

b. Generation of pBSCaMV35SHyg

A DNA fragment containing DNA encoding hygromycin
phosphotransferase flanked by the CaMV 35S promoter and the CaMV 35S
polyA signal sequence was obtained by PCR amplification of plasmid pCambia
1302 (GenBank Accession No. AF234298 and SEQ. ID. NO: 3). The primers
used in the amplification reaction were as follows:

CaMV35SpolyA:
5'-CTGAATTAACGCCGAATTAATTCGGGGGATCTG-3' SEQ. ID. NO: 4

CaMV35Spr:
5'-CTAGAGCAGCTTGCCAACATGGTGGAGCA-3' SEQ. ID. NO: 5

The 2100-bp PCR fragment was ligated with EcoRV-digested pBluescript II SK+
(Stratagene, La Jolla, CA, U.S.A.) to generate pBSCaMV35SHyg.
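
For readers who wish to check the two primers listed above before ordering or re-using them, basic properties such as length, GC content and the reverse complement can be computed with a few lines of code. The sketch below is illustrative only: the helper names `gc_fraction` and `reverse_complement` are hypothetical, the primer sequences are copied from SEQ. ID. NOS: 4 and 5 as printed here, and no claim is made about annealing temperatures or cycling conditions.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "CaMV35SpolyA": "CTGAATTAACGCCGAATTAATTCGGGGGATCTG",
    "CaMV35Spr": "CTAGAGCAGCTTGCCAACATGGTGGAGCA",
}

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.1%}")
        print(f"  reverse complement: {reverse_complement(seq)}")
```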

c. Generation of pAg1

To generate pAg1, pBSCaMV35SHyg was digested with HindIII/PstI and
ligated with HindIII/PstI-digested p3300attBZeo. Thus, pAg1 contains the
pCambia 3300 backbone with DNA conferring resistance to phosphinothricin and
hygromycin under the control of separate CaMV 35S promoters, an
attB-promoterless zeomycin resistance-encoding DNA recombination cassette and
unique sites for adding additional markers, e.g., DNA encoding GFP. The attB
site facilitates the addition of new DNA sequences to plant or animal, e.g.,
mammalian, artificial chromosomes, including PACs formed as a result of using
the pAg1 vector, or derivatives thereof, in the production of PACs. The attB
site provides a convenient site for recombinase-mediated insertion of DNAs
containing a homologous att site.

2. pAG2

The vector pAg2 (SEQ. ID. NO: 6; see Figure 3) is a derivative of vector
pAg1 formed by adding DNA encoding a green fluorescent protein (GFP), under
the control of a NOS promoter and flanked at the 3' end by a NOS polyA signal,
to pAg1. pAg2 was constructed as follows (see Figure 4). A DNA fragment
containing the NOS promoter was obtained by digestion of pGEM-T-NOS, or
pGEMEasyNOS (SEQ. ID. NO: 7), containing the NOS promoter in the cloning
vector pGEM-T-Easy (Promega Biotech, Madison, WI, U.S.A.), with XbaI/NcoI
and was ligated to an XbaI/NcoI fragment of pCambia 1302 containing DNA
encoding GFP (without the CaMV 35S promoter) to generate p1302NOS (SEQ.
ID. NO: 8) containing GFP-encoding DNA in operable association with the NOS
promoter. Plasmid p1302NOS was digested with SmaI/BsiWI to yield a
fragment containing the NOS promoter and GFP-encoding DNA. The fragment
was ligated with PmeI/BsiWI-digested pAg1 to generate pAg2. Thus, pAg2
contains DNA from the bar gene conferring resistance to phosphinothricin, DNA
conferring resistance to hygromycin, both resistance-encoding DNAs under the
control of a cauliflower mosaic virus 35S promoter, DNA encoding kanamycin
resistance, a GFP gene under the control of a NOS promoter and the
attB-zeomycin resistance-encoding DNA. One of skill in the art will appreciate that
other fragments can be used to generate the pAg1 and pAg2 derivatives and
that other heterologous DNA can be incorporated into pAg1 and pAg2 derivatives
using methods well known in the art.

3. pAglla and pAgllb transformation vectors

Vectors pAglla and pAgllb were constructed by inserting the following
DNA fragments into pAg1: DNA encoding β-glucuronidase, the nopaline
synthase terminator fragment, the nopaline synthase (NOS) promoter fragment,
a fragment of mouse satellite DNA and an N. tabacum rDNA intergenic spacer
sequence (IGS). The construction of pAglla and pAgllb was as follows (see
Figure 5).

An N. tabacum rDNA intergenic spacer (IGS) sequence (SEQ. ID. NO: 9;
see also GenBank Accession No. Y08422; see also Borysyuk et al. (2000)
Nature Biotechnology 18:1303-1306; Borysyuk et al. (1997) Plant Mol.
Biol. 35:655-660; U.S. Patent Nos. 6,100,092 and 6,355,860) was obtained by
PCR amplification of tobacco genomic DNA. The IGS can be used as a
targeting sequence by virtue of its homology to tobacco rDNA genes; the
sequence is also an amplification promoter sequence in plants. This fragment
was amplified using standard PCR conditions (e.g., as described by Promega
Biotech, Madison, WI, U.S.A.) from tobacco genomic DNA using the primers
shown below:

NTIGS-F1
5'- GTG CTA GCC AAT GTT TAA CAA GAT G- 3' (SEQ ID No. 10) and

NTIGS-R1
5'-ATG TCT TAA AAA AAA AAA CCC AAG TGA C- 3' (SEQ ID No. 11)

Following amplification, the fragment was cloned into pGEM-T Easy to give
pIGS-1.

A fragment of mouse satellite DNA (Msat1 fragment; GenBank Accession
No. V00846; and SEQ ID No. 12) was amplified via PCR from pSAT-1 using the
following primers:

MSAT-F1
5'- AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C -3' (SEQ ID No. 13)
and

MSAT-R1
5'-ATA ACC GCG GAG TCC TTC AGT GTG CAT- 3' (SEQ ID No. 14)

This amplification added a SacII and a HindIII site at the 5' end and a SacII site
at the 3' end of the PCR fragment. This fragment was then cloned into the
SacII site in pIGS-1 to give pMIGS-1, providing a eukaryotic centromere-specific
DNA and a convenient DNA sequence for detection via FISH.
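
The restriction sites said to be added by the MSAT primers can be confirmed by scanning each primer for the SacII (CCGCGG) and HindIII (AAGCTT) recognition sequences. The following is a small illustrative sketch under stated assumptions: the function names are hypothetical, the primer sequences are those printed above with spaces removed, and only the listed strand is scanned (sufficient here because both recognition sequences are palindromic).

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"SacII": "CCGCGG", "HindIII": "AAGCTT"}

# MSAT primers as printed (5' to 3'), spaces removed.
MSAT_F1 = "AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C".replace(" ", "")
MSAT_R1 = "ATA ACC GCG GAG TCC TTC AGT GTG CAT".replace(" ", "")


def find_all(seq: str, site: str) -> list[int]:
    """Return every 0-based start position of `site` within `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def find_sites(seq: str) -> dict[str, list[int]]:
    """Map each enzyme name to the positions of its site in `seq`."""
    return {name: find_all(seq.upper(), site) for name, site in SITES.items()}


if __name__ == "__main__":
    for name, primer in (("MSAT-F1", MSAT_F1), ("MSAT-R1", MSAT_R1)):
        print(name, find_sites(primer))
```

Run as written, the forward primer shows one SacII site immediately followed by one HindIII site, and the reverse primer shows a single SacII site, in agreement with the description of the amplified fragment.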

A functional marker gene containing a NOS-promoter:GUS:NOS
terminator fusion was then constructed containing the NOS promoter (GenBank
Accession No. U09365; SEQ ID No. 15), E. coli β-glucuronidase coding
sequence (from the GUS gene; GenBank Accession No. S69414; and SEQ ID
No. 16), and the nopaline synthase terminator sequence (GenBank Accession
No. U09365; SEQ ID No. 18). The NOS promoter in pGEM-T-NOS was added
to a promoterless GUS gene in pBlueScript (Stratagene, La Jolla, CA, U.S.A.)
using NotI/SpeI to form pNGN-1, which has the NOS promoter in the opposite
orientation relative to the GUS gene.

pMIGS-1 was digested with NotI/SpeI to yield a fragment containing the
mouse major satellite DNA and the tobacco IGS which was then added to
NotI-digested pNGN-1 to yield pNGN-2. The NOS promoter was then re-oriented to
provide a functional GUS gene, yielding pNGN-3, by digestion and religation
with SpeI. Plasmid pNGN-3 was then digested with HindIII, and the HindIII
fragment containing the β-glucuronidase coding sequence and the rDNA
intergenic spacer, along with the Msat sequence, was added to pAG-1 to form
pAglla, using the unique HindIII site in pAg1 located near the right T-DNA
border of pAg1, within the T-DNA region.

Another plasmid vector, referred to as pAgllb, was also recovered, which
contained the inserted HindIII fragment in the opposite orientation relative to
that observed in pAglla. Thus, pAglla and pAgllb differ only in the orientation
of the HindIII fragment containing the mouse major satellite sequence, the GUS
DNA sequence and the IGS sequence (see Figure 6). The nucleotide sequence
of pAglla is provided in SEQ. ID. NO: 21.

Vectors pAg1, pAg2, pAglla and pAgllb, as well as similarly designed
vectors containing a recombination site and a promoter (e.g., plant or animal
promoter), and possibly other regulatory sequences, in operable association with
DNA encoding a protein or other product for the expression in a host cell, such
as a plant or animal cell, can be used in the transfer of any protein (or other
product)-encoding nucleic acid of interest into a cell for expression thereof. For
example, any protein (or other product)-encoding nucleic acid of interest (in
operable association with transcriptional regulatory sequences suitable for use
in a particular host cell) can be inserted into any of the vectors pAg1, pAg2,
pAglla and pAgllb and thereby incorporated into a plant, animal or other artificial
chromosome, particularly a platform artificial chromosome ACes, as described
herein.

Example 6 

Agrobacterium-Mediated Transformation of Plant Cells 

Plant cells were transformed via Agrobacterium-mediated transformation
according to standard procedures (see, for example, Horsch et al. (1988) Plant
Molecular Biology Manual, A5:1-9, Kluwer Academic Publisher, Dordrecht,
Belgium). Briefly, Agrobacterium strain GV3101/pMP90 (see Koncz and Schell
(1986) Molecular and General Genetics 204:383-396) was transformed with
pAgIIa and pAgIIb (see Example 5) by heat shock, and the plasmid integrity of
pAgIIa and pAgIIb after transformation was verified by HindIII digest pattern.
pAgIIa/pMP90 or pAgIIb/pMP90 were cultured in 5 ml AB minimum medium
(Horsch et al. (1988) Plant Molecular Biology Manual, A5:1-9, Kluwer Academic
Publisher, Dordrecht, Belgium) containing 25 μg/ml kanamycin and 25 μg/ml
gentamycin at 28°C for two days.

Leaf disks of tobacco and Arabidopsis and root segments of Arabidopsis
were prepared as follows: tobacco leaves from 3 to 4 week-old explants were
cut into disks 1 cm in diameter, and Arabidopsis leaves were taken from
3 week-old seedlings and cut transversely into two halves. Roots of 3 week-old
Arabidopsis were excised into segments 1 cm in length. Cocultivation was
carried out by immersing leaf disks or root segments in bacterial culture for
2 minutes and then transferring the infected tissues to culture medium without
antibiotics for 2 days at 22°C with 16 hours/day of cool white fluorescent
light. The leaf disks of tobacco and Arabidopsis were cultured on MS104 medium
(MS, 3% sucrose, 0.05% MES, 1.0 mg/l BA, 0.1 mg/l NAA and 0.8% agar, pH 5.8)
and root segments on callus-inducing medium, CIM 0.5/0.05 (B5, 2% glucose,
0.05% MES, 0.5 mg/l 2,4-D, 0.05 mg/l kinetin and 0.8% agar, pH 5.8).

The transformed leaf disks and root segments were then transferred to
selection medium of MS104 or CIM 0.5/0.05, respectively, containing 20 mg/l
hygromycin and 300 mg/l Timentin for the elimination of Agrobacterium. The
selection medium was refreshed every two weeks and green shoots
regenerated. Plants were analyzed for the expression of the DNA encoding GUS
by standard histochemical and fluorescent assays and for evidence of
amplification of the inserted DNA by quantitative PCR. Numerous plants were
obtained that expressed high levels of GUS, and multiple copies of the GUS gene
were observed by Fluorescent In Situ Hybridization (FISH) and PCR analysis.
Thus, amplification of the chromosomal regions containing the inserted DNA was
observed. One of skill in the art will appreciate that GUS expression, or the
expression of any other gene, can be assessed using methods well known in the
art.

Example 7

Transfection and Culture of Arabidopsis Protoplasts

E. coli strain Stbl4 (Gibco Life Sciences) was transformed with pAgIIa,
pAgIIb, and one of two targeting plasmids containing the rDNA repeat sequence
from Arabidopsis (plasmid pJHD-14A) or the 26S rDNA from Arabidopsis (plasmid
pJHD2-19A), as described by Doelling et al. [(1993) Proc. Natl. Acad. Sci.
U.S.A. 90:7528-7532], via electroporation according to standard procedures.
A single colony was grown up in 250 ml LB medium containing 50 μg/ml
kanamycin (for selection based on the kanamycin resistance-encoding DNA in
pAgIIa and pAgIIb) or 50 μg/ml ampicillin (for selection based on the ampicillin
resistance-encoding DNA in pJHD-14A and pJHD2-19A) and cultured at 30°C
with shaking at 225 rpm for 16 hours. The plasmids were isolated according to
standard procedures well known in the art. The structural integrity of the
plasmids was checked by restriction digestion pattern, and the plasmids were
linearized with restriction enzymes. Plasmids were sterilized with chloroform
and 70% ethanol before use for transfection.

Arabidopsis protoplasts were resuspended in the culture medium (see
Example 1) at a density of 2 x 10^6 protoplasts/ml. A 300 μl protoplast
suspension was pipetted into a 15 ml tube, and 30 μl of plasmid (pAgIIa or
pAgIIb) and targeting DNA (pJHD-14A or pJHD2-19A) was added containing
10 μg plasmid and 100 μg targeting sequence followed immediately by slowly
adding 300 μl of 10% PEG. The targeting plasmids were included in the
transfection procedure in order to ensure that the amount of rDNA targeting DNA
(i.e., tobacco rDNA from pAgIIa or b and Arabidopsis DNA from the targeting
vectors) was sufficient to effect recombination of the introduced DNA at a
homologous site in an Arabidopsis chromosome. DNA was typically used in a
ratio of 10:1, targeting DNA (pJHD-14A or pJHD2-19A, or Lambda DNA) to
plasmid DNA (pAgIIa or pAgIIb, or a selectable marker plasmid), or in a ratio of
5:1. Generally, the number of base pairs of targeting DNA sufficient for
insertion into a plant chromosome is at least about 50 bp, or about 60 bp, or
about 70 bp, or about 80 bp, or about 90 bp, or about 100 bp, or about 150
bp, or about 200 bp, or about 300 bp, or about 400 bp, or about 500 bp, or
about 600 bp, or about 700 bp, or about 800 bp, or about 900 bp, or about 1
kb, or about 2 kb, or about 3 kb, or about 4 kb, or about 5 kb, or about 6 kb,
or about 7 kb, or about 8 kb, or about 9 kb, or about 10 kb or more. The
amount and length of targeting DNA sufficient to effect introduction into a
chromosome can be determined empirically and can vary for different plant
species.

The mixture was shaken gently, and immediately 300 μl of 10% PEG
solution was added slowly with gentle shaking. The protoplast mixture was
incubated at 22°C for 10-15 min with several cycles of gentle shaking. DNA
uptake was quenched by the addition of 5 ml of 72.4 g/l Ca(NO3)2. The
protoplasts were then centrifuged at 80 x g for 7 min and resuspended in culture
medium. For selection, 10 to 40 mg/l hygromycin was added to protoplast
cultures 14 days after transfection, and the culture medium was refreshed every
7 days. The protoplast cultures could also be selected after embedding in 0.6%
agarose by transferring to a culture medium containing 20 mg/l hygromycin. The
cultures were incubated for 14 days or longer at 22°C.
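
The quantities in the transfection mixture above lend themselves to a short worked calculation. The following is a minimal sketch, not part of the described procedure: it takes the stated volumes (300 μl protoplasts at 2 x 10^6 per ml, 30 μl DNA, 300 μl of 10% PEG) and the stated 72.4 g/l calcium nitrate, and derives the number of protoplasts per tube, the final PEG concentration and the molarity of the quench solution. The molar masses are standard values; whether the anhydrous salt or the tetrahydrate was weighed out is not stated in the text, so both are shown as assumptions.

```python
# Worked arithmetic for the PEG-mediated transfection mixture.
# Volumes and concentrations come from the text; the choice of salt form
# for calcium nitrate is an assumption, so both common forms are computed.

PROTOPLAST_DENSITY_PER_ML = 2e6   # protoplasts per ml of suspension
PROTOPLAST_VOLUME_UL = 300.0      # μl of protoplast suspension per tube
DNA_VOLUME_UL = 30.0              # μl of plasmid plus targeting DNA
PEG_VOLUME_UL = 300.0             # μl of PEG solution added
PEG_STOCK_PERCENT = 10.0          # PEG stock concentration (%)

CA_NITRATE_G_PER_L = 72.4
MOLAR_MASS_ANHYDROUS = 164.09     # g/mol, Ca(NO3)2
MOLAR_MASS_TETRAHYDRATE = 236.15  # g/mol, Ca(NO3)2.4H2O


def protoplasts_per_tube():
    """Number of protoplasts in the 300 μl aliquot."""
    return PROTOPLAST_DENSITY_PER_ML * PROTOPLAST_VOLUME_UL / 1000.0


def final_peg_percent():
    """PEG concentration after mixing protoplasts, DNA and PEG stock."""
    total_ul = PROTOPLAST_VOLUME_UL + DNA_VOLUME_UL + PEG_VOLUME_UL
    return PEG_STOCK_PERCENT * PEG_VOLUME_UL / total_ul


def molarity(grams_per_litre, molar_mass):
    """Convert a mass concentration (g/l) to mol/l."""
    return grams_per_litre / molar_mass


if __name__ == "__main__":
    print(f"protoplasts per tube: {protoplasts_per_tube():.0f}")
    print(f"final PEG: {final_peg_percent():.2f} %")
    print(f"Ca(NO3)2 if anhydrous: "
          f"{molarity(CA_NITRATE_G_PER_L, MOLAR_MASS_ANHYDROUS):.3f} M")
    print(f"Ca(NO3)2 if tetrahydrate: "
          f"{molarity(CA_NITRATE_G_PER_L, MOLAR_MASS_TETRAHYDRATE):.3f} M")
```

Under these assumptions each tube holds about 6 x 10^5 protoplasts and the PEG is diluted to a little under 5% in the final mixture.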

The Arabidopsis protoplasts were analyzed for the presence and
expression of the DNA encoding GUS. Recovered microcalli strongly expressed
GUS and were resistant to selective agents, indicating amplification of the
inserted DNA. Alternatively, the transfection of Arabidopsis protoplasts can
be conducted without using targeting DNA sequences since pAgIIa and pAgIIb
include a region of rDNA (i.e., the tobacco rDNA IGS) that can act as a targeting
sequence as long as a sufficient amount of pAgIIa/b plasmid is used in the
transfection procedure.

Example 8

Transfection and Culture of Tobacco Protoplasts

As described in Example 7, E. coli strain Stbl4 was transformed with pAgIIa,
pAgIIb, pJHD-14A (targeting DNA) and pJHD2-19A (targeting DNA) via
electroporation, and plasmid DNA was recovered and linearized with restriction
enzymes. Plasmids were sterilized with chloroform and 70% ethanol before use
for transfection.

The tobacco protoplasts (see Examples 2 and 3) were resuspended in the
culture medium (see Example 2) at a density of 2 x 10^6 protoplasts/ml. A 300
μl protoplast suspension was pipetted into a 15 ml tube, and 30 μl of plasmid
and targeting DNA was added as described in Example 7. The mixture was
shaken gently, and immediately 300 μl of 10% PEG solution was added slowly
with gentle shaking. The tobacco protoplast mixture was incubated at 22°C
for 10-15 min with several cycles of gentle shaking. DNA uptake was
quenched by the addition of 5 ml of 72.4 g/l Ca(NO3)2. The protoplasts were then
centrifuged at 80 x g for 7 min and resuspended in culture medium.

The recovery of viable tobacco protoplasts following DNA uptake ranged
from 65% to 75%. Typically greater than 35% of the
protoplasts initiated cell division within 7 days of treatment. Protoplast cells
were analyzed for gene expression (in this case for the expression of the
reporter DNA GUS, but alternatively, the expression of other genes can be
monitored). Between 4% and 6% of the recovered cells exhibited GUS
expression.

The protoplasts were subjected to selection procedures to recover
transformed cells. For selection of tobacco cells, 10 to 40 mg/l hygromycin
was added to protoplast cultures 10-14 days after transfection, and the culture
medium was refreshed every 7 days. Leaf disc selection was performed in the
presence of 40 mg/l hygromycin. Transformed microcalli were recovered and
analyzed for the expression of the GUS reporter gene. GUS positive calli were
isolated and subjected to FISH analysis (see Example 13). Plant cells that
exhibited amplification of the inserted DNA were identified.

Example 9 

Transfection and Culture of Brassica Protoplasts 

Brassica protoplasts (see Example 4), following the final washing step
after filtering through a 63 μm nylon screen and centrifugation, are collected
and used for DNA transfection as described in Example 8. Brassica protoplast
cultures following DNA uptake or transformation by Agrobacterium can be
selected with either hygromycin or glufosinate ammonium in liquid culture or in
embedded semi-solid cultures. The effective concentration of hygromycin is 10
to 40 mg/l for 2 to 4 weeks or continuously, whereas that for glufosinate
ammonium is 2 to 60 mg/l for 5 days to 2 weeks. Selection can impede growth,
and additional transfers to similar media may be required.

Example 10

Plant Regeneration from Brassica Protoplasts

Colonies of Brassica protoplasts (1 mm or larger in diameter) are plated
onto regeneration medium (basal Murashige and Skoog's medium, 1% sucrose,
2 mg/l BA, 0.01 mg/l NAA, 0.8% agarose, pH 5.6). Cultures are incubated
under the conditions described in Example 4. Cultures are transferred onto
fresh regeneration medium every 2 weeks. Regenerated shoots are transferred
onto autoclaved rooting medium (basal Murashige and Skoog's medium, 1%
sucrose, 0.1 mg/l NAA, 0.8% agar, pH 5.8) and incubated under dim
fluorescent light (25 μE m^-2 s^-1). Plantlets are potted in a soil-less mix (for
example, Terra-lite Redi-Earth, W.R. Grace & Co., Canada Ltd., Ajax, Ontario)
containing fertilizer (Nutricote 14-14-14 type 100, Plant Products Co. Ltd,
Brampton, Ontario) and grown in a growth room (20°C/15°C, 16 h
photoperiod, 100-140 μE m^-2 s^-1) with fluorescent and incandescent light at soil
level. Plantlets are covered with transparent plastic cups for one week to allow
for acclimatization.

Example 11

Isolation of Nuclei from Protoplasts

To facilitate analysis, plant cells can be subjected to nuclei isolation, and
the isolated nuclei can be analyzed by FISH or PCR. To isolate the nuclei,
protoplast calli were reprotoplasted according to the procedure of Mathur et al.
with modifications (see Mathur et al. Plant Cell Reports (1995) 14:221-226).
The protoplast calli were digested with 1.2% Cellulase 'Onozuka' R-10 and
0.4% w/v Macerozyme R-10 in nuclei isolation buffer (10 mM MES, pH 5.5,
0.2 M sucrose, 2.5 mM EDTA, 2.5 mM DTT, 0.1 mM spermine, 10 mM NaCl,
10 mM KCl and 0.15% Triton X-100) for 3 hours. After centrifugation at 80
x g for 10 minutes, the pellets of protoplasts were resuspended in hypertonic
buffer of 12.5% W5 solution (Hinnisdaels et al. (1994) Plant Molecular Biology
Manual G2:1-13, Kluwer Academic Publisher, Belgium) for 10 minutes. To
promote disruption of protoplasts, the protoplast suspension was forced through
a syringe needle four times. The disrupted protoplasts were filtered through 5
μm meshes to remove debris and centrifuged at 200 x g for 10 min. By
repeated washing of the pellet in a nuclei isolation buffer containing
phenylmethylsulfonylfluoride (PMSF) and centrifugation at 200 x g for 10
minutes, nuclei were collected as a white pellet freed from cytoplasmic
contamination and cellular debris. Samples were fixed in 3:1 methanol:glacial
acetic acid and were analyzed by FISH.

Example 12

Mitotic Arrest of Plant Cells for Detection of Amplification and
Artificial Chromosome Formation

In general, plant cells or protoplasts are cultured for two or more
generations prior to mitotic arrest. Typically, 5 μg/ml colchicine is added to
the cultures for 12 hours to accumulate mitotic plant cells. The mitotic cells
are harvested by gentle centrifugation. Alternatively, plant cells (grown on
plastic or in suspension) can be arrested in different stages of the cell cycle
with chemical agents other than colchicine, such as, but not limited to,
hydroxyurea, vinblastine, colcemid or aphidicolin, or through the deprivation of
nutrients, hormones, or growth factors. Chemical agents that arrest the cells in
stages other than mitosis, such as, but not limited to, hydroxyurea and
aphidicolin, are used to synchronize the cycles of all cells in the population
and are then removed from the cell medium to allow the cells to proceed, more or
less simultaneously, to mitosis, at which time they can be harvested to disperse
the chromosomes.

Example 13

Detection of Amplification and Artificial Chromosome Formation by
Fluorescence In Situ Hybridization (FISH)

A variety of plant cells can be analyzed by fluorescence in situ hybridization
(FISH) methods (Fransz et al. (1996) Plant J. 9:421-430; Fransz et al. (1998)
Plant J. 13:867-876; Wilkes et al. (1995) Chromosome Research 3:466-472;
Busch et al. (1994) Chromosome Research 2:15-20; Nkongolo (1993) Genome
35:701-705; Leitch et al. (1994) Methods In Molecular Biology 23:177-185;
Murata et al. (1997) Plant J. 12:31-37) to identify amplification events and
artificial chromosome formation.

FISH is used to detect specific DNA sequences on chromosomes, in
particular to detect regions of plant chromosomes that have undergone
amplification as a result of the introduction of heterologous DNA as described
herein, or to detect artificial chromosome formation in plant cells. FISH
chromosome spreads of Arabidopsis and tobacco plant cells into which
heterologous DNA has been introduced are generated using colchicine or similar
cell cycle arresting agents and various DNA probes (e.g., rDNA probe, Lambda
DNA probe, selectable marker probe). The cells are analyzed for the presence
of amplified regions of chromosomes, in particular amplification of the rDNA
regions, and those cells exhibiting amplification are further cultured and
analyzed for the formation of artificial chromosomes.

The chromosomes of plant cells subjected to introduction of heterologous
DNA and growth to generate artificial chromosomes can also be analyzed by
scanning electron microscopy. Preparation of mitotic chromosomes for
scanning electron microscopy can be performed using methods known in the
art (see, e.g., Sumner (1991) Chromosoma 100:410-418). The chromosomes
can be observed, for example, with a Hitachi S-800 field emission scanning
electron microscope operated with an accelerating voltage of 25 kV.



Example 14

Detection of Amplification and Artificial Chromosome Formation by
IdU Labeling of Chromosomes

The structure of the chromosomes in plant cells can be analyzed by labeling
the chromosomes with iododeoxyuridine (IdU), or other nucleotide analog, and
using an IdU-specific antibody to visualize the chromosome structure. Plant cell
cultures selected following introduction of heterologous DNA are labeled with
IdU following standard protocols (Fujishige and Taniguchi (1998) Chromosome
Research 5:611-619; Yanpaisan et al. (1998) Biotechnology and Bioengineering,
55:515-528; Trick and Bates (1996) Plant Cell Reports, 15:986-990; Binarova
et al. (1993) Theoretical and Applied Genetics, 57:9-16; Wang et al. (1991)
Journal of Plant Physiology, 735:200-203). Plant cells in culture, typically
suspension culture, are used. A series of sub-cultures are initiated, and IdU
labeling is performed as described above. Cells are allowed to incorporate IdU
for up to a week, depending on the doubling time of the culture. Labeled
chromosomes can be detected in plant cells (Fujishige and Taniguchi (1998)
Chromosome Research 5:611-619; Binarova et al. (1993) Theoretical and
Applied Genetics 57:9-16) and in mammalian cells (Gratzner and Leif (1981)
Cytometry 7:385-393) using procedures well known in the art. IdU-labeled
chromosomes are detected by immunocytochemical techniques. An anti-IdU
fluorescein isothiocyanate (FITC)-conjugated B44 clone antibody (Becton
Dickinson) is used to bind the IdU-DNA adduct in the DNA and is detected by
fluorescence microscopy (490 nm excitation, 519 nm emission). Analysis of
labeled chromosomes reveals the presence of amplified DNA regions and the
formation of artificial chromosomes.

Example 15 

Isolation of Metaphase Chromosomes from Protoplasts 

Artificial chromosomes, once detected in plant cells, may be isolated for
transfer to other organisms and in particular other plant species. Several
procedures may be used to isolate metaphase chromosomes from mitotic-arrested
plant cells, including, but not limited to, a polyamine-based buffer
system (Cram et al. (1990) Methods in Cell Biology 33:377-382), a modified
hexylene glycol buffer system (Hadlaczky et al. (1982) Chromosoma
86:643-659), a magnesium sulfate buffer system (Van den Engh et al. (1988)
Cytometry 5:266-270 and Van den Engh et al. (1984) Cytometry 5:108), an
acetic acid fixation buffer system (Stoehr et al. (1982) Histochemistry
74:57-61), and a technique utilizing hypotonic KCl and propidium iodide (Cram
et al. (1994) XVII meeting of the International Society for Analytical Cytology,
October 16-21, Tutorial IV Chromosome Analysis and Sorting with Commercial
Flow Cytometers; Cram et al. (1990) Methods in Cell Biology 33:376; de Jong
et al. (1999) Cytometry 35:129-133).

In an exemplary procedure, a hexylene glycol buffer is used to isolate plant
chromosomes from mitotic-arrested plant cells that have been converted to
protoplasts (Hadlaczky et al. (1982) Chromosoma 86:643-659). Chromosomes
are isolated from about 10^6 mitotic cells re-suspended in a glycine-hexylene
glycol buffer (100 mM glycine, 1% hexylene glycol, pH 8.4-8.6, adjusted with
a solution of saturated Ca(OH)2) supplemented with 0.1% Triton X-100 (GHT
buffer). The cells are incubated for 10 minutes at 37°C, and the chromosomes
are purified by differential centrifugation to pellet the nuclei (200 x g for
20 min) and sucrose gradient centrifugation (5-30% sucrose, 5600 x g for 60 min,
0-4°C). To avoid proteolytic degradation of chromosomal proteins, 1 mM PMSF
(phenylmethylsulfonylfluoride) is used in the presence of 1% isopropyl alcohol.
The proteins can be extracted from the isolated chromosomes using dextran
sulfate-heparin (DSH) extraction, and the chromosomes can be visualized via
electron microscopy using techniques known in the art (Hadlaczky et al. (1982)
Chromosoma (Berl.) 86:643-659; Hadlaczky et al. (1981) Chromosoma (Berl.)
81:537-555). Additionally, modifications of these procedures, including, but
not limited to, modification of the buffer composition (Carrano et al. (1979)
Proc. Natl. Acad. Sci. U.S.A. 76:1382-1384) and variation of the centrifugation
time or speed, to accommodate different plant species can be implemented by
any skilled artisan.
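
Because the centrifugation steps in this and the preceding Examples are specified as relative centrifugal force (for example 200 x g and 5600 x g), a rotor speed has to be derived for whatever centrifuge is at hand. The sketch below uses the standard relation RCF = 1.118 x 10^-5 x r x N^2, with r the rotor radius in cm and N the speed in rpm. The 10 cm radius is a hypothetical value chosen only for illustration; no rotor is specified in the text.

```python
import math

# Standard conversion factor between rpm and relative centrifugal force
# when the rotor radius is expressed in centimetres.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    HYPOTHETICAL_RADIUS_CM = 10.0  # assumption, not taken from the text
    for target in (80, 200, 5600):
        speed = rpm_from_rcf(target, HYPOTHETICAL_RADIUS_CM)
        print(f"{target} x g at r = {HYPOTHETICAL_RADIUS_CM} cm: {speed:.0f} rpm")
```

The required speed scales with the square root of the target force, so a different rotor radius changes the rpm values but not the ratio between the two steps.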

Example 16 

Transfer of Artificial Chromosomes into Plant Cells: Transfer of
Mammalian Artificial Chromosomes into a Dicot Plant: Arabidopsis

One method of delivery of mammalian artificial chromosomes (MACs) into
plant cells is the formation of microcells containing murine MACs and the
CaPO4-mediated uptake or the PEG-mediated fusion of these microcells with
plant protoplasts. In this example, microcells and plant protoplasts, such as but
not limited to tobacco and Arabidopsis protoplasts, were mixed (in a series of
25:1, 10:1, 5:1, or 2:1 microcells:protoplasts ratio) and fusion was observed.
Protocols for the formation of microcells are known in the art and are described,
for example, in US Patent Nos. 5,240,840, 4,806,476 and 5,298,429 and in
Fournier Proc. Natl. Acad. Sci. U.S.A. (1981) 78:6349-6353 and Lambert et al.
Proc. Natl. Acad. Sci. U.S.A. (1991) 88:5907-5912. The murine microcells
can be labeled with IdU or the MACs stained with a specific dye such as, but
not limited to, propidium iodide or DAPI, prior to fusion with plant
protoplasts including, but not limited to, Arabidopsis and tobacco protoplasts,
to facilitate detection of the presence of MACs in the protoplasts.

In this example, MACs were introduced into Arabidopsis cells using
microcell-PEG mediated fusion. Microcells were formed from murine cells
containing an artificial chromosome (see U.S. Patent No. 6,077,697) and were
fused with freshly prepared Arabidopsis protoplasts in a ratio of 10:1,
microcells to protoplasts. Fusion occurred in the presence of 25% PEG 6000,
204 mM CaCl2, pH 6.9 within the first 5 minutes of mixing. Typically less than
about one minute of mixing is required to observe fusion between microcells
and protoplasts. Fused cells were washed with 240 mM CaCl2, then floated on
top of a solution of 204 mM sucrose in B5 salts. Cells were then transferred to
cell suspension culture media (MS, 87 mM sucrose, 2.7 μM naphthalene acetic
acid, 0.23 μM kinetin, pH 5.8). Empirical observations can be used to
determine the optimal concentration and composition of PEG and the
concentration of calcium that provides the highest degree of fusion with the
least toxicity.
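
The fusion, wash and culture solutions above are specified in millimolar terms, whereas media in the earlier Examples are given as percentages or mg/l. As an illustrative aid only, the following sketch converts the stated sucrose and calcium chloride concentrations to grams per litre. The molar masses are standard values; the use of anhydrous CaCl2 is an assumption, since the text does not state which form of the salt was used.

```python
# Convert millimolar concentrations from the fusion protocol to g/l.
# Molar masses are standard values; anhydrous CaCl2 is assumed because
# the text does not specify the hydrate form.

MOLAR_MASS_G_PER_MOL = {
    "sucrose": 342.30,
    "CaCl2 (anhydrous, assumed)": 110.98,
}

# (component, concentration in mM, where it is used in the protocol)
SOLUTIONS = [
    ("CaCl2 (anhydrous, assumed)", 204.0, "fusion mixture"),
    ("CaCl2 (anhydrous, assumed)", 240.0, "wash"),
    ("sucrose", 204.0, "flotation solution in B5 salts"),
    ("sucrose", 87.0, "cell suspension culture medium"),
]


def millimolar_to_g_per_l(millimolar, molar_mass):
    """Convert a concentration in mM to grams per litre."""
    return millimolar / 1000.0 * molar_mass


if __name__ == "__main__":
    for name, mm, use in SOLUTIONS:
        grams = millimolar_to_g_per_l(mm, MOLAR_MASS_G_PER_MOL[name])
        print(f"{mm:g} mM {name} ({use}): {grams:.1f} g/l")
```

On this calculation the 87 mM sucrose in the culture medium corresponds to roughly 3% (w/v), in line with the sucrose levels used in the other media described herein.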

Fused protoplasts were allowed to grow for one or more generations.
The presence of a mouse chromosomal sequence, including MACs, was
demonstrated by Southern hybridization with MAC probes, by FISH analysis and
by PCR analysis using, for example, satellite sequences known to exist on the
MAC chromosome. Thus, the mouse sequences were detected in the
Arabidopsis protoplasts.

To further demonstrate the transfer of mouse chromosomal sequence to
Arabidopsis protoplasts, Arabidopsis plant cell nuclei were isolated according
to Example 11 and were subjected to FISH analysis according to Example 13,
using the mouse major satellite DNA (SEQ ID No. 12). A portion of the nuclei
contained a significant signal using the mouse major satellite DNA, indicating
successful transfer of at least a mouse chromosome and/or MAC to the
Arabidopsis nuclei.

Similarly, PACs may be introduced into Arabidopsis protoplasts using
PEG- and/or calcium-mediated fusion procedures. Generation of
microprotoplasts and protoplasts can be conducted as described, for example,
in Example 1. Microprotoplasts formed from plant cells containing a plant
artificial chromosome are fused with freshly prepared Arabidopsis protoplasts,
for example, in a ratio of 10:1, microprotoplasts to protoplasts. Protoplasts
from other plants, including but not limited to, tobacco, wheat, maize and rice,
can also be used as the recipient of MACs and/or PACs. Fused protoplasts are
recovered and allowed to grow for one or more generations. The presence of
the transferred PACs can be analyzed using methods such as, for example,
those described herein (including Southern hybridization with PAC probes, FISH
analysis and PCR analysis using DNA sequences specific to the PAC).

Example 17

Transfer of Artificial Chromosomes into Plant Cells: Transfer of
Mammalian Artificial Chromosomes into a Second Dicot Plant: Tobacco

MACs were introduced into tobacco cells using microcell-PEG mediated
fusion using the same microcells, MAC, and protocol as described in Example
16. Microcells were formed from murine cells containing an artificial
chromosome and were fused with freshly prepared tobacco BY-2 protoplasts in
a ratio of 10:1, microcells to protoplasts. Fusion occurred in the presence of
20% PEG 4000 and 100-200 mM calcium chloride. Empirical observations are
used to determine the optimal concentration and composition of PEG and the
concentration of calcium that provides the highest degree of fusion with the
least toxicity.

DAPI staining of the microcells (e.g., by preincubating the microcells
with DAPI at a final concentration of 1 μg/ml)
allowed visualization of the fusion and transfer of the chromosomes to the
tobacco protoplasts. Fused protoplasts were recovered and allowed to grow for
one or more generations. The fused protoplasts can be analyzed for the
presence of a MAC in a number of ways, including those described herein.
Nuclei were isolated according to Example 11 from tobacco protoplasts that had
been fused with microcells and were subjected to FISH
analysis according to Example 13, using the mouse major satellite DNA (SEQ
ID No. 12). Numerous nuclei were found to have incorporated a mouse
chromosome.

Example 18

Transfer of Isolated Artificial Chromosomes by Lipid-Mediated Transfer
into a Monocot Plant: Rice

Isolated murine artificial chromosomes (MACs) prepared by sorting
through a FACS apparatus (de Jong et al. Cytometry (1999) 35:129-133) were
transferred into rice plant protoplasts by cationic lipid-mediated transfection of
the purified MAC. Purified MACs (see Example 15 and U.S. Patent No.
6,077,697) were mixed with LipofectAMINE 2000 (Gibco, Md, USA) as follows.
Typically, 15 μl of LipofectAMINE 2000 were added to 1 x 10^6 artificial
chromosomes in liquid buffer, the solution was allowed to complex for up to
three hours, and then the solution was added to 1 x 10^5 freshly prepared rice
protoplasts prepared using standard protoplast methods well known in the art.
The uptake of the lipid-complexed artificial chromosome was monitored by
adding to the mixture of protoplasts and purified artificial chromosomes a
fluorescent dye that stains DNA. Microscopic examination of the
protoplast/artificial chromosome mixture over the next several hours allowed the
visualization of the artificial chromosome being transported across the
protoplast cellular membrane and the presence of the readily identifiable MAC
in the cytoplasm of the rice plant cell.

The same procedure as described in this Example for cationic
lipid-mediated transfer of an isolated MAC into rice protoplasts can be used to
transfer isolated MACs, as well as PACs, into rice and other plant protoplasts,
including but not limited to, tobacco, wheat, maize and Arabidopsis. Fused
protoplasts are recovered and allowed to grow for one or more generations.
The presence of the transferred MACs and PACs can be analyzed using
methods such as, for example, those described herein (including, but not limited
to, Southern hybridization with PAC probes, FISH analysis and PCR analysis
using DNA sequences specific to the PAC).

Example 19

Delivery of Plant Regulatory and Coding Sequences via a Promoterless attBZeo
Marker Gene in pAg2 onto a MAC Platform

As described in Examples 6-15, the plasmid pAg2, comprising plant
regulatory and selectable marker genes (SEQ ID NO: 6; prepared as set forth in
Example 5), can be used for the production of a MAC containing said plant
expressible genes. In this example, pAg2, by virtue of the attBZeo DNA
sequences contained on the plasmid, is used for the loading of plant regulatory
and selectable marker genes onto MACs in mammalian cells using the attB
sequences to recombine with attP sequences present on a platform MAC. In
this example, platform MACs are produced with attP sequences and the plasmid
pAg2 is then loaded onto the platform MAC. New MACs so produced are
useful for introduction into plant cells by virtue of the plant expressible markers
contained therein.

A. Construction of Platform MAC containing pSV40attPsensePUR (Figure
7; SEQ ID NO: 26).

An example of a selectable marker system for the creation of a MAC-based
platform into which the plasmid pAg2 can target plant regulatory and
coding sequences is shown in Figure 7. This system includes a vector
containing the SV40 early promoter immediately followed by (1) a 282 base pair
(bp) sequence containing the bacteriophage lambda attP site and (2) the
puromycin resistance marker. Initially a PvuII/StuI fragment containing the
SV40 early promoter from plasmid pPUR (Clontech Laboratories, Inc., Palo Alto,
CA; SEQ ID No. 22) was subcloned into the EcoICRI site of pNEB193 (a
pUC19 derivative obtained from New England Biolabs, Beverly, MA; SEQ ID No.
23) generating the plasmid pSV40193.

The attP site was PCR amplified from lambda genome (GenBank
Accession # NC_001416) using the following primers:

attPUP: CCTTGCGCTAATGCTCTGTTACAGG (SEQ ID No. 24)

attPDWN: CAGAGGCAGGGAGTGGGACAAAATTG (SEQ ID No. 25)
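
Basic properties of the two attP primers can be checked with a few lines of code before ordering or using them. The sketch below is illustrative only: it reports the length and GC content of each primer exactly as printed above and gives its reverse complement, which is useful when locating the primer on the opposite strand of the lambda sequence. The helper names are hypothetical, and the code makes no assumption about where the primers anneal in the lambda genome.

```python
# Simple sequence checks for the attP amplification primers.
# Sequences are copied from the text; helper names are illustrative.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMERS = {
    "attPUP": "CCTTGCGCTAATGCTCTGTTACAGG",      # SEQ ID No. 24
    "attPDWN": "CAGAGGCAGGGAGTGGGACAAAATTG",    # SEQ ID No. 25
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return sum(1 for base in seq if base in "GC")


def gc_percent(seq):
    """GC content as a percentage of sequence length."""
    return 100.0 * gc_count(seq) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f} %, "
              f"reverse complement {reverse_complement(seq)}")
```

Both primers are of similar length and GC content, which is generally desirable for a primer pair used in the same reaction.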

After amplification and purification of the resulting fragment, the attP site
was cloned into the SmaI site of pSV40193 and the orientation of the attP site
was determined by DNA sequence analysis (plasmid pSV40193attP). The gene
encoding puromycin resistance (Puro) was isolated by digesting the plasmid
pPUR (Clontech Laboratories, Inc., Palo Alto, CA) with AgeI/BamHI followed by
filling in the overhangs with Klenow and subsequently cloned into the AscI site
downstream of the attP site of pSV40193attP generating the plasmid
pSV40193attPsensePUR (Figure 7; SEQ ID NO: 26).

The plasmid pSV40193attPsensePUR was digested with ScaI and
co-transfected with the plasmid pFK161 into mouse LMtk- cells and platform
artificial chromosomes were identified and isolated as described herein. Briefly,
puromycin-resistant colonies were isolated and subsequently tested for artificial
chromosome formation via fluorescent in situ hybridization (FISH) (using mouse
major and minor DNA repeat sequences, the puromycin gene and telomere
sequences as probes), and then sorted by fluorescence-activated cell sorting
(FACS). From this sort, a subclone was isolated containing an artificial
chromosome, designated B19-38. FISH analysis of the B19-38 subclone
demonstrated the presence of telomeres and mouse minor DNA repeat sequences on
the MAC. DOT PCR revealed the absence of uncharacterized euchromatic regions on
the MAC. The process for generating this exemplary MAC platform containing
multiple site-specific recombination sites is summarized in Figure 5. This MAC
chromosome may subsequently be engineered to contain target gene expression
nucleic acids using the lambda integrase mediated site-specific recombination
system as described below.

15 B. Construction of Targeting Vector. 

The construction of the targeting vector pAg2 is set forth in Example 5 

herein. 

C. Transfection of Promotorless Marker and Selection With Drug (See 
Figure 9). 

The mouse LMtk- cell line containing the MAC B19-38 (constructed as
set forth above and also referred to as a 2nd generation platform ACE) is plated
onto four 10 cm dishes at approximately 5 million cells per dish. The cells are
incubated overnight in DMEM with 10% fetal calf serum at 37°C and 5% CO2.
The following day the cells are transfected with 5 µg of the vector pAg2
(prepared as described in Example 5 above) and 5 µg of pCXLamIntR (encoding
a lambda integrase having an E to R amino acid substitution at position 174),
for a total of 10 µg per 10 cm dish. Lipofectamine Plus reagent is used to
transfect the cells according to the manufacturer's protocol. Two days
post-transfection zeocin is added to the medium at 500 µg/ml. The cells are
maintained in selective medium until colonies are formed. The colonies are then
ring-cloned and genomic DNA is analyzed.
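
The quantities in this transfection step can be tallied with a short calculation. The following Python sketch is illustrative bookkeeping only: the function and variable names are hypothetical, and the medium volume per dish is an assumed parameter because the text does not state it. Only the dish count, cell number, DNA amounts and zeocin concentration are taken from the description above.

```python
# Illustrative bookkeeping for the transfection step; names are hypothetical.
# Values taken from the text: 4 dishes, ~5 million cells per dish,
# 5 ug of each plasmid per dish, zeocin at 500 ug/ml.
DISHES = 4
CELLS_PER_DISH = 5_000_000
UG_PER_DISH = {"pAg2": 5.0, "pCXLamIntR": 5.0}
ZEOCIN_UG_PER_ML = 500.0


def dna_per_dish(ug_per_dish):
    """Total DNA (ug) applied to a single dish."""
    return sum(ug_per_dish.values())


def dna_for_experiment(ug_per_dish, dishes):
    """Amount of each plasmid (ug) needed to cover all dishes."""
    return {name: ug * dishes for name, ug in ug_per_dish.items()}


def zeocin_needed_ug(medium_ml):
    """Zeocin (ug) for a given medium volume; the volume is an assumption
    supplied by the caller, since the text does not specify it."""
    return ZEOCIN_UG_PER_ML * medium_ml


if __name__ == "__main__":
    print("DNA per dish (ug):", dna_per_dish(UG_PER_DISH))
    print("Plasmid totals (ug):", dna_for_experiment(UG_PER_DISH, DISHES))
    print("Cells plated:", DISHES * CELLS_PER_DISH)
```

The per-dish total reproduces the 10 µg stated in the text; the experiment-wide totals simply scale that figure by the four dishes.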

D. Analysis Of Clones (PCR, SEQUENCING). 

Genomic DNA (including MACs) is isolated from each of the candidate
clones with the Wizard kit (Promega) following the manufacturer's protocol.
The following primer set is used to analyze the genomic DNA isolated from the
zeocin resistant clones: 5PacSV40 -
CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO: 28); Antisense Zeo -
TGAACAGGGTCACGTCGTCC (SEQ ID NO: 29). PCR amplification using the
above primers and genomic DNA, which included MACs, from the candidate
clones results in a PCR product indicating the correct sequence for the desired
site-specific integration event.
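
As a quick sanity check on the primer pair, basic sequence properties can be computed directly. The Python sketch below is an illustration, not part of the described protocol: the helper names are hypothetical, the 5PacSV40 sequence is assumed to be one continuous primer (it is split across a line break in the source), and the presence of a PacI recognition sequence (TTAATTAA) is inferred from the primer name.

```python
# Primer sequences as given in the text (SEQ ID NO: 28 and SEQ ID NO: 29).
# Assumption: the 5PacSV40 primer is one continuous sequence.
PRIMERS = {
    "5PacSV40": "CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG",
    "Antisense Zeo": "TGAACAGGGTCACGTCGTCC",
}

# PacI recognition sequence; its role in the primer is inferred from the name.
PACI_SITE = "TTAATTAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, "length:", len(seq), "GC fraction:", round(gc_fraction(seq), 3))
    print("PacI site offset in 5PacSV40:", PRIMERS["5PacSV40"].find(PACI_SITE))
    print("Sense strand matched by Antisense Zeo:",
          reverse_complement(PRIMERS["Antisense Zeo"]))
```

The reverse complement of the antisense primer is the sense-strand sequence it anneals against, which is the form to search for when checking the expected amplicon in a sequence file.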

The MACs containing the pAg2 vector are identified and used for transfer
into plant (such as described in Examples 16 and 17) or animal cells for the
expression of the desired coding sequences contained therein. The MACs
containing pAg2 carry two plant selectable markers (hygromycin resistance,
resistance to phosphinothricin) and a visual selectable marker (green fluorescent
protein).

Example 20 

Construction of Plant-derived Shuttle Artificial Chromosome.

In another embodiment, the plant artificial chromosomes provided herein
are useful as selectable shuttle vectors that are able to move one or more
desired genes back and forth between plant and mammalian cells. In this
particular embodiment, the plant artificial chromosome is bi-functional in that
proper integration of donor nucleic acid can be selected for in both plant and
mammalian cells.

For example, a plant artificial chromosome is prepared as described in
Examples 6-15 above using the plasmid pAg2 (Example 5; SEQ ID NO: 6)
that has been modified to include the SV40attPsensePur coding region from the
plasmid pSV40193attPsensePur (described above in Example 19.A.). Thus, the
resulting plant-derived shuttle artificial chromosome contains DNA from the bar
gene conferring resistance to phosphinothricin in plant cells, DNA from the
hygromycin resistance gene conferring resistance to hygromycin in plant cells,
both resistance-encoding DNAs under the control of a separate cauliflower
mosaic virus (CaMV) 35S promoter, the attB-promoterless zeomycin resistance-
encoding DNA, and DNA conferring resistance to puromycin under the control
of a mammalian SV40 promoter. Accordingly, the presence of the shuttle PAC
in either a plant or mammalian cell can be selected for by treatment with, for
example, either hygromycin (plant) or puromycin (mammalian).
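
The marker arrangement described in this paragraph can be summarized as a small lookup. The Python sketch below is a simplified, hypothetical model: the class and function names are illustrative, and it assumes a marker is usable for selection only in the host type named for its promoter in the text. The promoterless zeomycin marker is treated as not selectable until a recombination event places it downstream of a promoter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Marker:
    agent: str               # selective agent the marker confers resistance to
    promoter: Optional[str]  # None means the marker is promoterless
    host: Optional[str]      # host type in which the promoter is described as active


# Markers of the shuttle PAC as listed in the text.
SHUTTLE_PAC_MARKERS = [
    Marker("phosphinothricin", "CaMV 35S", "plant"),
    Marker("hygromycin", "CaMV 35S", "plant"),
    Marker("zeomycin", None, None),  # attB-promoterless; not expressed as-is
    Marker("puromycin", "SV40", "mammalian"),
]


def selection_agents(host: str) -> List[str]:
    """Agents that could select for the shuttle PAC in the given host type."""
    return [
        m.agent
        for m in SHUTTLE_PAC_MARKERS
        if m.promoter is not None and m.host == host
    ]


if __name__ == "__main__":
    print("plant:", selection_agents("plant"))
    print("mammalian:", selection_agents("mammalian"))
```

Under these assumptions the lookup returns phosphinothricin and hygromycin for plant cells and puromycin for mammalian cells, matching the selection options named in the text.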

Because the resulting plant-derived shuttle artificial chromosome contains
at least one SV40attP site therein similar to the platform MAC prepared in
Example 19.A. above, a donor vector containing an attB-selectable marker
sequence, such as a plasmid comprising an attBzeo (e.g. pAg2) can be used to
selectively introduce desired heterologous nucleic acids from any species (such
as plants, animals, insects and the like) into the shuttle artificial chromosome
that is present in a mammalian cell.

Likewise, a plant promoter region, such as CaMV35S, can be used to
replace the SV40 promoter in the SV40attPPur region of the modified pAg2
plasmid described above. In this embodiment, because the resulting
plant-derived shuttle artificial chromosome contains at least one CaMV35SattP site
therein analogous to the platform MAC prepared in Example 19.A. above, a
donor vector containing an attB-selectable marker sequence, such as a plasmid
having attBkanamycin, or other plant selectable or scorable marker can be used
to selectively introduce desired heterologous nucleic acids from any species
(such as plants, animals, insects and the like) into the shuttle artificial
chromosome that is present in a plant cell.

Since modifications will be apparent to those of skill in this art, it is 
intended that this invention be limited by only the scope of the appended 
claims.

What is Claimed:

1. A method for producing an artificial chromosome, comprising:
introducing nucleic acid into a cell comprising one or more plant
chromosomes; and
selecting a cell comprising an artificial chromosome that comprises
one or more repeat regions
wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

2. The method of claim 1, wherein the artificial chromosome is 
predominantly made up of one or more repeat regions. 

3. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises a nucleic acid sequence that facilitates amplification of a
region of a plant chromosome or targets it to an amplifiable region of a plant
chromosome.

4. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises one or more nucleic acids selected from the group consisting
of rDNA, lambda phage DNA and satellite DNA.

5. The method of claim 4, wherein the nucleic acid comprises plant 

rDNA. 

6. The method of claim 5, wherein the rDNA is from a plant selected
from the group consisting of Arabidopsis, Nicotiana, Solanum, Lycopersicon,
Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza.

7. The method of claim 4, wherein the nucleic acid comprises animal 

rDNA. 

8. The method of claim 7, wherein the rDNA is mammalian rDNA.

9. The method of claim 4, wherein the nucleic acid comprises rDNA
comprising sequence of an intergenic spacer region.

10. The method of claim 9, wherein the intergenic spacer region is
from DNA from a plant selected from the group consisting of Arabidopsis,
Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat, radish and mung
bean.

11. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises a nucleic acid sequence that facilitates identification of cells
containing the nucleic acid.

12. The method of claim 11, wherein the nucleic acid sequence
encodes a fluorescent protein.

13. The method of claim 12, wherein the protein is a green fluorescent
protein.

14. The method of claim 1, wherein the step of selecting a cell
comprising an artificial chromosome comprises sorting of cells into which
nucleic acid was introduced.

15. The method of claim 1, wherein the step of selecting a cell 
comprising an artificial chromosome comprises fluorescent in situ hybridization 
(FISH) analysis of cells into which nucleic acid was introduced. 

16. The method of claim 1, wherein the one or more plant
chromosomes contained in the cell is (are) selected from the group consisting
of Arabidopsis, tobacco and Helianthus cells.

17. The method of claim 16, wherein the cell is a plant protoplast. 

18. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises nucleic acid encoding a selectable marker.

19. The method of claim 18, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, kanamycin, 
hygromycin, dihydrofolate or sulfonylurea. 

20. An isolated plant artificial chromosome comprising one or more
repeat regions, wherein:




one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

21. The plant artificial chromosome of claim 20, wherein the artificial
chromosome is predominantly made up of one or more repeat regions.

22. A plant cell comprising an artificial chromosome, wherein the 
artificial chromosome is produced by the method of claim 1 or claim 2. 

23. A method of producing a transgenic plant, comprising introducing
the artificial chromosome of claim 20 or claim 21 into a plant cell.

24. The method of claim 23, wherein the artificial chromosome
comprises heterologous nucleic acid encoding a gene product.

25. The method of claim 24, wherein the heterologous nucleic acid
encodes a product selected from the group consisting of enzymes, antisense
RNA, tRNA, rDNA, structural proteins, marker proteins, ligands, receptors,
ribozymes, therapeutic proteins and biopharmaceutical proteins.

26. The method of claim 24, wherein the heterologous nucleic acid
encodes a product selected from the group consisting of vaccines, blood
factors, antigens, hormones, cytokines, growth factors and antibodies.

27. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product that provides for resistance to diseases, insects, herbicides 
or stress in the plant. 

28. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that provides for an agronomically important trait in the
plant.

29. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that alters the nutrient utilization and/or improves the
nutrient quality of the plant.

30. The method of claim 24, wherein the heterologous nucleic acid is
contained within a bacterial artificial chromosome (BAC) or a yeast artificial
chromosome (YAC).

31. A method of identifying plant genes encoding particular traits,
comprising:
generating an artificial chromosome comprising euchromatic DNA
from a first species of plant;
introducing the artificial chromosome into a plant cell of a second
species of plant; and
detecting phenotypic changes in the plant cell comprising the
artificial chromosome and/or a plant generated from the plant cell comprising
the artificial chromosome.

32. The method of claim 31, wherein the artificial chromosome is a 
plant artificial chromosome or a mammalian artificial chromosome. 

33. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a cell comprising one or more plant
chromosomes; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions, wherein:
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

34. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising a SATAC.

35. The method of claim 31, wherein the artificial chromosome is a
minichromosome produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a cell comprising a minichromosome comprising a
neo-centromere and euchromatin.

36. The method of any of claims 33-35, wherein the nucleic acid
introduced into the plant cell comprises DNA encoding a selectable marker.

37. The method of claim 36, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, kanamycin, 
hygromycin, dihydrofolate or sulfonylurea. 

38. The method of claim 31, wherein the artificial chromosome
comprising euchromatic DNA from a first plant species is produced by a method
comprising:
introducing into a plant cell of a first plant species an artificial
chromosome capable of undergoing homologous recombination with the DNA
of the first plant species;
selecting for a recombination event between the artificial chromosome
and the DNA of the first plant species; and
selecting an artificial chromosome comprising euchromatic DNA from the
first plant species.

39. The method of claim 31, wherein the artificial chromosome
comprising euchromatic DNA from a first plant species is produced by a method
comprising:
introducing into a plant cell of a first species an artificial chromosome
capable of undergoing site-specific recombination with the DNA of the first plant
species;
selecting for a site-specific recombination event between the artificial
chromosome and the DNA of the first plant species; and
selecting an artificial chromosome comprising euchromatic DNA from the
first plant species.

40. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence.

41. The method of claim 39, wherein the artificial chromosome
comprises a site-specific recombination sequence.

42. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence and
the artificial chromosome comprises a site-specific recombination sequence.

43. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence and
the artificial chromosome comprises a site-specific recombination sequence that
is complementary to the site-specific recombination sequence of the plant cell
of a first plant species.

44. The method of claim 39, wherein the site-specific recombination 
is catalyzed by a recombinase enzyme. 

45. A method for producing an acrocentric plant chromosome,
comprising:
introducing a first nucleic acid comprising a site-specific
recombination site into a first chromosome of a plant cell;
introducing a second nucleic acid comprising a site-specific
recombination site into a second chromosome of the plant cell;
introducing a recombinase activity into the plant cell, wherein the
activity catalyzes recombination between the first and second chromosomes
and whereby an acrocentric plant chromosome is produced.

46. The method of claim 45, wherein the first nucleic acid is 
introduced into the pericentric heterochromatin of the first chromosome. 

47. The method of claim 45, wherein the second nucleic acid is
introduced into the distal end of the arm of the second chromosome.

48. The method of claim 45, wherein the first nucleic acid is
introduced into the pericentric heterochromatin of the first chromosome and the
second nucleic acid is introduced into the distal end of the arm of the second
chromosome.

49. A method for producing an acrocentric plant chromosome,
comprising:
introducing a first nucleic acid comprising a site-specific
recombination site into the pericentric heterochromatin of a chromosome in a
plant cell;
introducing a second nucleic acid comprising a site-specific
recombination site into the distal end of the chromosome, wherein the first and
second recombination sites are located on the same arm of the chromosome;
introducing a recombinase activity into the cell, wherein the
activity catalyzes recombination between the first and second recombination
sites in the chromosome and whereby an acrocentric plant chromosome is
produced.

50. A method for producing an acrocentric plant chromosome,
comprising:
introducing nucleic acid comprising a recombination site adjacent
to nucleic acid encoding a selectable marker into a first plant cell;
generating a first transgenic plant from the first plant cell;
introducing nucleic acid comprising a promoter functional in a plant
cell, a recombination site and a recombinase coding region in operative linkage
into a second plant cell;
generating a second transgenic plant from the second plant cell;
crossing the first and second plants;
obtaining plants resistant to an agent that selects for cells
containing the nucleic acid encoding the selectable marker; and
selecting a resistant plant that contains cells comprising an
acrocentric plant chromosome.

51. The method of any of claims 45-50, wherein the DNA of the short
arm of the acrocentric chromosome contains less than 5% euchromatic DNA.

52. The method of any of claims 45-50, wherein the DNA of the short
arm of the acrocentric chromosome contains less than 1% euchromatic DNA.

53. The method of any of claims 45-50, wherein the short arm of the
acrocentric chromosome does not contain euchromatic DNA.

54. The method of any of claims 45-49, wherein the nucleic acid
introduced into a chromosome comprises nucleic acid encoding a selectable
marker.

55. An acrocentric plant artificial chromosome, wherein the short arm 
of the acrocentric chromosome does not contain euchromatic DNA. 

56. A method of producing a plant artificial chromosome, comprising:
introducing nucleic acid into a plant acrocentric chromosome in a
cell, wherein the short arm of the acrocentric chromosome does not contain
euchromatic DNA;
culturing the cell through at least one cell division; and
selecting a cell comprising an artificial chromosome, is
predominantly heterochromatic.

57. The method of claim 56, wherein the acrocentric chromosome is
produced by the method of any of claims 45-49.

58. A method for producing an artificial chromosome, comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions
wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

59. The method of claim 4, wherein the nucleic acid comprises plant 
rDNA from a dicot plant species. 

60. The method of claim 4, wherein the nucleic acid comprises plant
rDNA from a monocot plant species.

61. The method of claim 9, wherein the intergenic spacer region is
from DNA from a Nicotiana plant.

62. The method of claim 9, wherein the rDNA is plant rDNA. 

63. The method of claim 62, wherein the plant is a dicot plant species.

64. The method of claim 62, wherein the plant is a monocot plant
species.

65. The method of claim 1, wherein the cell is a dicot plant cell. 

66. The method of claim 1, wherein the cell is a monocot plant cell.

67. An isolated plant artificial chromosome comprising one or more
repeat regions, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

68. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions, wherein:
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that represent
euchromatic and heterochromatic nucleic acid.

69. The method of claim 44, wherein the recombinase is selected from
the group consisting of a bacteriophage P1 Cre recombinase, a yeast R
recombinase and a yeast FLP recombinase.

70. The method of claim 50, further comprising selecting first and
second transgenic plants wherein:
one of the plants comprises a chromosome comprising a
recombination site located on a short arm of the chromosome in a region
adjacent to the pericentric heterochromatin; and
the other plant comprises a chromosome comprising a
recombination site located in rDNA of the chromosome.

71. The method of claim 70, wherein the recombination sites on the
two chromosomes are in the same orientation.

72. A method for producing an acrocentric plant chromosome,
comprising:
introducing nucleic acid comprising two site-specific recombination
sites into a cell comprising one or more plant chromosomes;
introducing a recombinase activity into the cell, wherein the
activity catalyzes recombination between the two recombination sites, whereby
a plant acrocentric chromosome is produced.

73. The method of claim 72, wherein the two site-specific
recombination sites are contained on separate nucleic acid fragments.

74. The method of claim 73, wherein the separate nucleic acid 
fragments are introduced into the cell simultaneously or sequentially. 

75. The method of claim 56, wherein the artificial chromosome is
predominantly heterochromatic.

76. A method of producing a plant artificial chromosome, comprising:
introducing nucleic acid into a plant chromosome in a cell, wherein
the chromosome contains adjacent regions of rDNA and heterochromatic DNA;
culturing the cell through at least one cell division; and
selecting a cell comprising an artificial chromosome.

77. The method of claim 76, wherein the artificial chromosome is 
predominantly heterochromatic. 

78. The method of claim 76 or claim 77, wherein the plant
chromosome into which the nucleic acid is introduced is an acrocentric
chromosome.

79. The method of claim 78, wherein the short arm of the
chromosome contains adjacent regions of rDNA and heterochromatic DNA.

80. The method of any of claims 76-79, wherein the heterochromatic 
DNA is pericentric heterochromatin. 

81. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth of
animal cells in the presence of an agent normally toxic to the animal cells; and
wherein the agent is not toxic to plant cells;
a recognition site for recombination; and
a sequence of nucleotides that facilitates amplification of a region
of a plant chromosome or targets the vector to an amplifiable region of a plant
chromosome.

82. The vector of claim 81, wherein the amplifiable region comprises
heterochromatic nucleic acid.

83. The vector of claim 81, wherein the amplifiable region comprises
rDNA.

84. The vector of claim 81, wherein the sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the vector
to an amplifiable region of a plant chromosome comprises a sufficient portion
of an intergenic spacer region of rDNA to facilitate amplification or effect the
targeting.

85. The vector of claim 84, wherein the sufficient portion contains at
least 14, 20, 30, 50, 100, 150, 300 or 500 contiguous nucleotides from an
intergenic spacer region.

86. The vector of claim 81, wherein the selectable marker encodes a
product that confers resistance to zeomycin.

88. The vector of claim 81, wherein the recognition site comprises an
att site.

89. The vector of claim 81, that is pAgIIa or pAgIIb.

90. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth of
animal cells in the presence of an agent normally toxic to the animal cells; and
wherein the agent is not toxic to plant cells;
a recognition site for recombination; and
nucleic acid encoding a protein operably linked to a plant promoter.

91. The vector of claim 90, wherein the recognition site comprises an
att site.

92. The vector of claim 90, further comprising a sequence of
nucleotides that facilitates amplification of a region of a plant chromosome or
targets the vector to an amplifiable region of a plant chromosome.

93. The vector of claim 90, wherein the promoter is nopaline synthase
(NOS) or CaMV35S.

94. The vector of claim 93 that is pAg1 or pAg2.

95. The vector of claim 92, wherein the amplifiable region comprises 
heterochromatic nucleic acid. 

96. The vector of claim 92, wherein the amplifiable region comprises 

rDNA. 

97. The vector of claim 96, wherein the sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the vector
to an amplifiable region of a plant chromosome comprises a sufficient portion
of an intergenic spacer region of rDNA to effect the amplification or the
targeting.

98. The vector of claim 90, wherein the protein is a selectable marker
that permits growth of plant cells in the presence of an agent normally toxic to
the plant cells.

99. The vector of claim 98, wherein the selectable marker confers
resistance to hygromycin or to phosphinothricin.

100. The vector of claim 90, wherein the protein is a fluorescent
protein.

101. The vector of claim 90, wherein the fluorescent protein is selected
from the group consisting of green, blue and red fluorescent proteins.

102. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth of
plant cells in the presence of an agent normally toxic to the plant cells; and
wherein the agent is not toxic to animal cells;
a recognition site for recombination; and
nucleic acid encoding a protein operably linked to a plant promoter.

103. A vector, comprising:
a recognition site for recombination; and
a sequence of nucleotides that facilitates amplification of a region
of a plant chromosome or targets the vector to an amplifiable region of a plant
chromosome, wherein the plant is selected from the group consisting of
Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus, Hordeum, Zea mays,
Brassica, Triticum, Helianthus, Glycine, soybean, Gossypium, cotton,
Helianthus, sunflower and Oryza.

104. The vector of claim 103, wherein the recognition site comprises
an att site.

105. A cell, comprising a vector of any of claims 81-104. 

106. The cell of claim 105 that is a plant cell. 
107. A method, comprising:
introducing a vector of claim 90 into a cell, wherein:
the cell comprises an animal platform ACes that contains a recognition site that
recombines with the recognition site in the vector in the presence of the
recombinase therefor, thereby incorporating the selectable marker that is not
operably associated with any promoter and the nucleic acid encoding a protein
operably linked to a plant promoter into the platform ACes to produce a
resulting platform ACes.

108. The method of claim 107, wherein the recombination sites are att
sites.

109. The method of claim 107, wherein the animal is a mammal.

110. The method of claim 107, wherein the platform ACes comprises 
a promoter that upon recombination is operably linked to the selectable marker 
that in the vector is not operably associated with a promoter. 

111. The method of any of claims 107-110, further comprising,
transferring the resulting platform ACes into a plant cell to produce a plant cell
that comprises the platform ACes.

112. The method of claim 111, wherein the resulting platform ACes is
isolated prior to transfer.

113. The method of claim 111, wherein the isolated ACes is introduced
into a plant cell by a method selected from the group consisting of protoplast
transfection, lipid-mediated delivery, liposomes, electroporation, sonoporation,
microinjection, particle bombardment, silicon carbide whisker-mediated
transformation, polyethylene glycol (PEG)-mediated DNA uptake, lipofection and
lipid-mediated carrier systems.

114. The method of claim 111, wherein the resulting platform ACes is
transferred by fusion of the cells.

115. The method of claim 111, wherein the cells are plant protoplasts.

116. The method of claim 107, wherein the cell is an animal
cell.

117. The method of claim 116, wherein the animal cell is a mammalian
cell.

118. The method of claim 111, further comprising culturing the plant
cell that comprises the platform ACes under conditions whereby the protein
encoded by the nucleic acid that is operably linked to a plant promoter is
expressed.

119. A method, comprising:
introducing a vector of claim 81 into a plant cell;
culturing the plant cells; and
selecting a plant cell comprising an artificial chromosome that comprises
one or more repeat regions.

120. The method of claim 119, wherein sufficient portion of the vector
integrates into a chromosome in the plant cell to result in amplification of
chromosomal DNA.

121. The method of claim 119 or claim 120, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

122. The method of claim 119, further comprising isolating the artificial
chromosome.

123. A method, comprising:
introducing a vector into a cell, wherein:
i) the vector comprises:
    a) nucleic acid encoding a selectable marker that is
    not operably associated with any promoter, wherein the selectable
    marker permits growth of animal cells in the presence of an agent
    normally toxic to the animal cells; and wherein the agent is not
    toxic to plant cells;
    b) a recognition site for recombination; and
    c) nucleic acid encoding a protein operably linked to
    an animal promoter;
ii) the cell comprises:
    a platform plant artificial chromosome (PAC) that comprises
    a recombination site and an animal promoter that upon
    recombination is operably linked to the selectable marker that, in
    the vector, is not operably associated with a promoter;
iii) introduction is effected under conditions whereby the
vector recombines with the PAC to produce a plant platform PAC that contains
the selectable marker operably linked to the promoter; and
culturing the resulting cell under conditions, whereby the protein encoded
by nucleic acid operably linked to an animal promoter is expressed.

124. The method of claim 119, wherein the artificial chromosome is an
ACes.

125. The method of claim 123, wherein the plant platform PAC is an
ACes.

126. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises nucleic acid encoding a selectable marker. 

1 27. The vector of claim 81 , further comprising one or more selectable 
15 markers that when expressed in the plant cell permit the selection of the cell. 
128. A plant transformation vector, comprising: 
a recognition site for recombination; 

a sequence of nucleotides that facilitates amplification of a region 
of a plant chromosome or targets the vector to an amplifiable region of a plant 
20 chromosome; and 

one or more selectable markers that when expressed in a plant cell 
permit the selection of the cell; wherein 

the plant transformation vector is for Agrobacterium-mediated
transformation of plants.

129. A method of producing a plant artificial chromosome, comprising:

introducing the vector of any of claims 81, 127 and 128 into a cell
comprising one or more plant chromosomes; and 

selecting a cell comprising an artificial chromosome that comprises 
one or more repeat regions; wherein 
one or more nucleic acid units is (are) repeated in a repeat region;

repeats of a nucleic acid unit have common nucleic acid
sequences; and

the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

130. A method of producing a plant artificial chromosome, comprising:

introducing the vector of any of claims 81, 127 and 128 into a cell
comprising one or more plant chromosomes; and 

selecting a cell comprising an artificial chromosome that comprises 
one or more repeat regions; wherein 
one or more nucleic acid units is (are) repeated in a repeat region;

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 
131. The method of claim 123, wherein the cell into which the vector
is introduced is an animal cell.

132. The method of claim 131, wherein the cell is a mammalian cell.

AMENDED CLAIMS

[received by the International Bureau on 24 December 2002 (24.12.02); 
original claims 3, 9, 16, 20, 35, 52, 56, 80, 101, 105, 107, 111, 116, 123 and 128-132 amended; 
remaining claims unchanged (17 pages)] 

What is Claimed: 

1. A method for producing an artificial chromosome, comprising:
introducing nucleic acid into a cell comprising one or more plant
chromosomes; and
selecting a cell comprising an artificial chromosome that
comprises one or more repeat regions
wherein:

one or more nucleic acid units is (are) repeated in a repeat
region;

repeats of a nucleic acid unit have common nucleic acid
sequences; and

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 

2. The method of claim 1, wherein the artificial chromosome is
predominantly made up of one or more repeat regions.

3. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises a nucleic acid sequence that facilitates amplification of a 
region of a plant chromosome or that targets the nucleic acid to an 
amplifiable region of a plant chromosome. 

4. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises one or more nucleic acids selected from the group
consisting of rDNA, lambda phage DNA and satellite DNA.

5. The method of claim 4, wherein the nucleic acid comprises 
plant rDNA. 

6. The method of claim 5, wherein the rDNA is from a plant
selected from the group consisting of Arabidopsis, Nicotiana, Solanum,
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza.

7. The method of claim 4, wherein the nucleic acid comprises 
animal rDNA. 

8. The method of claim 7, wherein the rDNA is mammalian rDNA.

9. The method of claim 4, wherein the nucleic acid comprises
rDNA comprising a sequence of an intergenic spacer region. 

10. The method of claim 9, wherein the intergenic spacer region is 
from DNA from a plant selected from the group consisting of Arabidopsis, 

Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat, radish and mung
bean. 

11. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises a nucleic acid sequence that facilitates identification of 
cells containing the nucleic acid. 

12. The method of claim 11, wherein the nucleic acid sequence
encodes a fluorescent protein.

13. The method of claim 12, wherein the protein is a green
fluorescent protein. 

14. The method of claim 1, wherein the step of selecting a cell 
comprising an artificial chromosome comprises sorting of cells into which
nucleic acid was introduced.

15. The method of claim 1, wherein the step of selecting a cell
comprising an artificial chromosome comprises fluorescent in situ 
hybridization (FISH) analysis of cells into which nucleic acid was introduced. 

16. The method of claim 1, wherein the one or more plant
chromosomes contained in the cell is (are) selected from the group consisting
of Arabidopsis, tobacco and Helianthus chromosomes.

17. The method of claim 16, wherein the cell is a plant protoplast. 

18. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises nucleic acid encoding a selectable marker.

19. The method of claim 18, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, 
kanamycin, hygromycin, dihydrofolate or sulfonylurea. 

20. An isolated plant artificial chromosome comprising one or more 
repeat regions, wherein:

one or more nucleic acid units is (are) repeated in a repeat
region;

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

21. The plant artificial chromosome of claim 20, wherein the artificial 
chromosome is predominantly made up of one or more repeat regions. 

22. A plant cell comprising an artificial chromosome, wherein the 
artificial chromosome is produced by the method of claim 1 or claim 2.

23. A method of producing a transgenic plant, comprising 
introducing the artificial chromosome of claim 20 or claim 21 into a plant cell. 

24. The method of claim 23, wherein the artificial chromosome 
comprises heterologous nucleic acid encoding a gene product. 

25. The method of claim 24, wherein the heterologous nucleic acid
encodes a product selected from the group consisting of enzymes, antisense
RNA, tRNA, rDNA, structural proteins, marker proteins, ligands, receptors, 
ribozymes, therapeutic proteins and biopharmaceutical proteins. 

26. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product selected from the group consisting of vaccines, blood
factors, antigens, hormones, cytokines, growth factors and antibodies.

27. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product that provides for resistance to diseases, insects, herbicides 
or stress in the plant. 

28. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that provides for an agronomically important trait in the
plant. 

29. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product that alters the nutrient utilization and/or improves the 
nutrient quality of the plant.

30. The method of claim 24, wherein the heterologous nucleic acid
is contained within a bacterial artificial chromosome (BAC) or a yeast 
artificial chromosome (YAC). 

31. A method of identifying plant genes encoding particular traits,
comprising:

generating an artificial chromosome comprising euchromatic 
DNA from a first species of plant; 

introducing the artificial chromosome into a plant cell of a 
second species of plant; and 
detecting phenotypic changes in the plant cell comprising the
artificial chromosome and/or a plant generated from the plant cell comprising
the artificial chromosome. 

32. The method of claim 31, wherein the artificial chromosome is a 
plant artificial chromosome or a mammalian artificial chromosome. 

33. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:

introducing nucleic acid into a cell comprising one or more plant 
chromosomes; and 

selecting a plant cell comprising an artificial chromosome that 
comprises one or more repeat regions, wherein:

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 
34. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:

introducing nucleic acid into a plant cell; and 
selecting a plant cell comprising a SATAC. 
35. The method of claim 31, wherein the artificial chromosome is a 
minichromosome produced by a method comprising:

introducing nucleic acid into a plant cell; and
selecting a cell comprising a minichromosome comprising a 
neo-centromere and euchromatin. 

36. The method of any of claims 33-35, wherein the nucleic acid 
introduced into the plant cell comprises DNA encoding a selectable marker.

37. The method of claim 36, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, 
kanamycin, hygromycin, dihydrofolate or sulfonylurea. 

38. The method of claim 31, wherein the artificial chromosome 
comprising euchromatic DNA from a first plant species is produced by a
method comprising:

introducing into a plant cell of a first plant species an artificial 
chromosome capable of undergoing homologous recombination with the DNA 
of the first plant species; 
selecting for a recombination event between the artificial chromosome
and the DNA of the first plant species; and

selecting an artificial chromosome comprising euchromatic DNA from 
the first plant species. 

39. The method of claim 31, wherein the artificial chromosome 
comprising euchromatic DNA from a first plant species is produced by a
method comprising:

introducing into a plant cell of a first species an artificial chromosome 
capable of undergoing site-specific recombination with the DNA of the first 
plant species; 

selecting for a site-specific recombination event between the artificial
chromosome and the DNA of the first plant species, and

selecting an artificial chromosome comprising euchromatic DNA from 
the first plant species. 

40. The method of claim 39, wherein the DNA of the plant cell of a 
first species is modified to comprise a site-specific recombination sequence.

41. The method of claim 39, wherein the artificial chromosome
comprises a site-specific recombination sequence. 

42. The method of claim 39, wherein the DNA of the plant cell of a 
first species is modified to comprise a site-specific recombination sequence 

and the artificial chromosome comprises a site-specific recombination
sequence. 

43. The method of claim 39, wherein the DNA of the plant cell of a 
first species is modified to comprise a site-specific recombination sequence 
and the artificial chromosome comprises a site-specific recombination 

sequence that is complementary to the site-specific recombination sequence
of the plant cell of a first plant species. 

44. The method of claim 39, wherein the site-specific 
recombination is catalyzed by a recombinase enzyme. 

45. A method for producing an acrocentric plant chromosome, 
comprising:

introducing a first nucleic acid comprising a site-specific 
recombination site into a first chromosome of a plant cell; 

introducing a second nucleic acid comprising a site-specific 
recombination site into a second chromosome of the plant cell; 
introducing a recombinase activity into the plant cell, wherein
the activity catalyzes recombination between the first and second
chromosomes and whereby an acrocentric plant chromosome is produced. 

46. The method of claim 45, wherein the first nucleic acid is 
introduced into the pericentric heterochromatin of the first chromosome. 

47. The method of claim 45, wherein the second nucleic acid is
introduced into the distal end of the arm of the second chromosome.

48. The method of claim 45, wherein the first nucleic acid is 
introduced into the pericentric heterochromatin of the first chromosome and 
the second nucleic acid is introduced into the distal end of the arm of the 

second chromosome.

49. A method for producing an acrocentric plant chromosome,
comprising: 

introducing a first nucleic acid comprising a site-specific 
recombination site into the pericentric heterochromatin of a chromosome in a 
plant cell;

introducing a second nucleic acid comprising a site-specific 
recombination site into the distal end of the chromosome, wherein the first and 
second recombination sites are located on the same arm of the chromosome; 

introducing a recombinase activity into the cell, wherein the 
activity catalyzes recombination between the first and second recombination
sites in the chromosome and whereby an acrocentric plant chromosome is 
produced. 

50. A method for producing an acrocentric plant chromosome, 
comprising: 

introducing nucleic acid comprising a recombination site adjacent
to nucleic acid encoding a selectable marker into a first plant cell;

generating a first transgenic plant from the first plant cell; 
introducing nucleic acid comprising a promoter functional in a 
plant cell, a recombination site and a recombinase coding region in operative 
linkage into a second plant cell;

generating a second transgenic plant from the second plant cell; 
crossing the first and second plants; 

obtaining plants resistant to an agent that selects for cells 
containing the nucleic acid encoding the selectable marker; and 
selecting a resistant plant that contains cells comprising an
acrocentric plant chromosome.

51. The method of any of claims 45-50, wherein the DNA of the 
short arm of the acrocentric chromosome contains less than 5% euchromatic 
DNA. 

52. The method of claim 51, wherein the DNA of the short arm of the
acrocentric chromosome contains less than 1% euchromatic DNA.

53. The method of any of claims 45-50, wherein the short arm of the
acrocentric chromosome does not contain euchromatic DNA. 

54. The method of any of claims 45-49, wherein the nucleic acid 
introduced into a chromosome comprises nucleic acid encoding a selectable 

marker.

55. An acrocentric plant artificial chromosome, wherein the short arm 
of the acrocentric chromosome does not contain euchromatic DNA. 

56. A method of producing a plant artificial chromosome, comprising: 
introducing nucleic acid into a plant acrocentric chromosome in a
cell, wherein the short arm of the acrocentric chromosome does not contain
euchromatic DNA; 

culturing the cell through at least one cell division; and 
selecting a cell comprising an artificial chromosome that is 
predominantly heterochromatic. 
57. The method of claim 56, wherein the acrocentric chromosome is
produced by the method of any of claims 45-49.

58. A method for producing an artificial chromosome, comprising: 
introducing nucleic acid into a plant cell; and 
selecting a plant cell comprising an artificial chromosome that 
comprises one or more repeat regions
wherein: 

one or more nucleic acid units is (are) repeated in a repeat region;

repeats of a nucleic acid unit have common nucleic acid
sequences; and

the common nucleic acid sequences comprise sequences that 
represent euchromatic and heterochromatic nucleic acid. 

59. The method of claim 4, wherein the nucleic acid comprises plant 
rDNA from a dicot plant species. 
60. The method of claim 4, wherein the nucleic acid comprises plant
rDNA from a monocot plant species.

61. The method of claim 9, wherein the intergenic spacer region is
from DNA from a Nicotiana plant. 

62. The method of claim 9, wherein the rDNA is plant rDNA. 

63. The method of claim 62, wherein the plant is a dicot plant 
species.

64. The method of claim 62, wherein the plant is a monocot plant 
species. 

65. The method of claim 1, wherein the cell is a dicot plant cell. 

66. The method of claim 1, wherein the cell is a monocot plant cell.

67. An isolated plant artificial chromosome comprising one or more
repeat regions, wherein:

one or more nucleic acid units is (are) repeated in a repeat region; 

repeats of a nucleic acid unit have common nucleic acid 
sequences; and

the common nucleic acid sequences comprise sequences that 
represent euchromatic and heterochromatic nucleic acid. 

68. The method of claim 31, wherein the artificial chromosome is 
produced by a method comprising: 

introducing nucleic acid into a plant cell; and

selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions, wherein:

repeats of a nucleic acid unit have common nucleic acid
sequences; and

the common nucleic acid sequences comprise sequences that represent
euchromatic and heterochromatic nucleic acid.

69. The method of claim 44, wherein the recombinase is selected 
from the group consisting of a bacteriophage P1 Cre recombinase, a yeast R 
recombinase and a yeast FLP recombinase. 

70. The method of claim 50, further comprising selecting first and
second transgenic plants wherein:

one of the plants comprises a chromosome comprising a
recombination site located on a short arm of the chromosome in a region 
adjacent to the pericentric heterochromatin; and 

the other plant comprises a chromosome comprising a 
recombination site located in rDNA of the chromosome.

71. The method of claim 70, wherein the recombination sites on the
two chromosomes are in the same orientation. 

72. A method for producing an acrocentric plant chromosome, 
comprising: 

introducing nucleic acid comprising two site-specific
recombination sites into a cell comprising one or more plant chromosomes;

introducing a recombinase activity into the cell, wherein the
activity catalyzes recombination between the two recombination sites, whereby
a plant acrocentric chromosome is produced.

73. The method of claim 72, wherein the two site-specific
recombination sites are contained on separate nucleic acid fragments.

74. The method of claim 73, wherein the separate nucleic acid 
fragments are introduced into the cell simultaneously or sequentially. 

75. The method of claim 56, wherein the artificial chromosome is 
predominantly heterochromatic.

76. A method of producing a plant artificial chromosome, comprising: 
introducing nucleic acid into a plant chromosome in a cell,
wherein the chromosome contains adjacent regions of rDNA and
heterochromatic DNA;

culturing the cell through at least one cell division; and

selecting a cell comprising an artificial chromosome. 

77. The method of claim 76, wherein the artificial chromosome is 
predominantly heterochromatic. 

78. The method of claim 76 or claim 77, wherein the plant 
chromosome into which the nucleic acid is introduced is an acrocentric
chromosome.

79. The method of claim 78, wherein the short arm of the
chromosome contains adjacent regions of rDNA and heterochromatic DNA. 

80. The method of claim 76, 77, or 79, wherein the 
heterochromatic DNA is pericentric heterochromatin. 

81. A vector, comprising:

nucleic acid encoding a selectable marker that is not operably 
associated with any promoter, wherein the selectable marker permits growth 
of animal cells in the presence of an agent normally toxic to the animal cells; 
and wherein the agent is not toxic to plant cells; 
a recognition site for recombination; and

a sequence of nucleotides that facilitates amplification of a 
region of a plant chromosome or targets the vector to an amplifiable region 
of a plant chromosome. 

82. The vector of claim 81, wherein the amplifiable region 
comprises heterochromatic nucleic acid.

83. The vector of claim 81, wherein the amplifiable region 
comprises rDNA. 

84. The vector of claim 81, wherein the sequence of nucleotides 
that facilitates amplification of a region of a plant chromosome or targets the 

vector to an amplifiable region of a plant chromosome comprises a sufficient
portion of an intergenic spacer region of rDNA to facilitate amplification or 
effect the targeting. 

85. The vector of claim 84, wherein the sufficient portion contains 
at least 14, 20, 30, 50, 100, 150, 300 or 500 contiguous nucleotides from 

an intergenic spacer region.

86. The vector of claim 81, wherein the selectable marker encodes 
a product that confers resistance to zeomycin. 

87. A plant transformation vector, comprising: 
a recognition site for recombination; 

a sequence of nucleotides that facilitates amplification of a
region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome; and 

one or more selectable markers that when expressed in a plant 
cell permit the selection of the cell; wherein 
the plant transformation vector is for Agrobacterium-mediated
transformation of plants.

88. The vector of claim 81, wherein the recognition site comprises 
an att site. 

89. The vector of claim 81 that is pAgIIa or pAgIIb.

90. A vector, comprising:

nucleic acid encoding a selectable marker that is not operably 
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal cells;
and wherein the agent is not toxic to plant cells;

a recognition site for recombination; and

nucleic acid encoding a protein operably linked to a plant promoter. 

91. The vector of claim 90, wherein the recognition site comprises 
an att site. 

92. The vector of claim 90, further comprising a sequence of 

nucleotides that facilitates amplification of a region of a plant chromosome or
targets the vector to an amplifiable region of a plant chromosome. 

93. The vector of claim 90, wherein the promoter is nopaline 
synthase (NOS) or CaMV35S. 

94. The vector of claim 93 that is pAg1 or pAg2.

95. The vector of claim 92, wherein the amplifiable region
comprises heterochromatic nucleic acid.

96. The vector of claim 92, wherein the amplifiable region 
comprises rDNA. 

97. The vector of claim 96, wherein the sequence of nucleotides 
that facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome comprises a sufficient
portion of an intergenic spacer region of rDNA to effect the amplification or 
the targeting. 

98. The vector of claim 90, wherein the protein is a selectable 
marker that permits growth of plant cells in the presence of an agent
normally toxic to the plant cells.

99. The vector of claim 98, wherein the selectable marker confers 
resistance to hygromycin or to phosphinothricin.

100. The vector of claim 90, wherein the protein is a fluorescent 
protein.

101. The vector of claim 100, wherein the fluorescent protein is 
selected from the group consisting of green, blue and red fluorescent proteins. 

102. A vector, comprising: 

nucleic acid encoding a selectable marker that is not operably 
associated with any promoter, wherein the selectable marker permits growth
of plant cells in the presence of an agent normally toxic to the plant cells; and 
wherein the agent is not toxic to animal cells; 

a recognition site for recombination; and 

nucleic acid encoding a protein operably linked to a plant promoter. 
103. A vector, comprising:

a recognition site for recombination; and 
a sequence of nucleotides that facilitates amplification of a 
region of a plant chromosome or targets the vector to an amplifiable region of 
a plant chromosome, wherein the plant is selected from the group consisting 
of Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus, Hordeum, Zea
mays, Brassica, Triticum, Helianthus, Glycine, soybean, Gossypium, cotton,
Helianthus, sunflower and Oryza.

104. The vector of claim 103, wherein the recognition site comprises 
an att site.

105. A cell, comprising a vector of any of claims 81-86 and 88-104.

106. The cell of claim 105 that is a plant cell.

107. A method, comprising:

introducing a vector of claim 90 into a cell, wherein: 
the cell comprises an animal platform ACes that contains a recognition site 
that recombines with the recognition site in the vector in the presence of the 
recombinase therefor, thereby incorporating the selectable marker that is not
operably associated with any promoter and the nucleic acid encoding a protein 
operably linked to a plant promoter into the platform ACes to produce a 
resulting platform ACes. 

108. The method of claim 107, wherein the recombination sites are 
att sites.

109. The method of claim 107, wherein the animal is a mammal. 

110. The method of claim 107, wherein the platform ACes comprises
a promoter that upon recombination is operably linked to the selectable 
marker that in the vector is not operably associated with a promoter. 

111. The method of any of claims 107-110, further comprising
transferring the resulting platform ACes into a plant cell to produce a plant
cell that comprises the platform ACes.

112. The method of claim 111, wherein the resulting platform ACes 
is isolated prior to transfer. 

113. The method of claim 111, wherein the isolated ACes is
introduced into a plant cell by a method selected from the group consisting of
protoplast transfection, lipid-mediated delivery, liposomes, electroporation,
sonoporation, microinjection, particle bombardment, silicon carbide whisker-mediated
transformation, polyethylene glycol (PEG)-mediated DNA uptake,
lipofection and lipid-mediated carrier systems.

114. The method of claim 111, wherein the resulting platform ACes 
is transferred by fusion of the cells. 

115. The method of claim 111, wherein the cells are plant 
protoplasts. 

116. The method of claim 107, wherein the cell is an animal cell.



AMENDED SHEET (ARTICLE 19) 



WO 2002/096923 






PCT/US2002/017451 



117. The method of claim 116, wherein the animal cell is a
mammalian cell. 

118. The method of claim 111, further comprising culturing the plant
cell that comprises the platform ACes under conditions whereby the protein
encoded by the nucleic acid that is operably linked to a plant promoter is
expressed. 

119. A method, comprising: 

introducing a vector of claim 81 into a plant cell; 
culturing the plant cells; and 
selecting a plant cell comprising an artificial chromosome that comprises
one or more repeat regions.

120. The method of claim 119, wherein sufficient portion of the vector
integrates into a chromosome in the plant cell to result in amplification of 
chromosomal DNA. 

121. The method of claim 119 or claim 120, wherein:

one or more nucleic acid units is (are) repeated in a repeat region; 

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

122. The method of claim 119, further comprising isolating the 
artificial chromosome. 

123. A method, comprising: 

introducing a vector into a cell, wherein:

i) the vector comprises: 

a) nucleic acid encoding a selectable marker that is 
not operably associated with any promoter, wherein the 
selectable marker permits growth of animal cells in the presence 
of an agent normally toxic to the animal cells; and wherein the
agent is not toxic to plant cells;

b) a recognition site for recombination; and

c) nucleic acid encoding a protein operably linked to 
an animal promoter; 

ii) the cell comprises: 
a platform plant artificial chromosome (PAC) that
comprises a recombination site and an animal promoter that upon
recombination is operably linked to the selectable marker that, in
the vector, is not operably associated with a
promoter;

iii) introduction is effected under conditions whereby
the vector recombines with the PAC to produce a plant platform PAC that
contains the selectable marker operably linked to the promoter; and 

culturing the resulting cell under conditions, whereby the protein 
encoded by nucleic acid operably linked to an animal promoter is expressed. 

124. The method of claim 119, wherein the artificial chromosome is an
ACes.

125. The method of claim 123, wherein the plant platform PAC is an 

ACes. 

126. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises nucleic acid encoding a selectable marker.

127. The vector of claim 81, further comprising one or more selectable 
markers that when expressed in the plant cell permit the selection of the cell. 

128. A method of producing a plant artificial chromosome, comprising:
introducing the vector of claim 81, 87 or 127 into a cell
comprising one or more plant chromosomes; and

selecting a cell comprising an artificial chromosome that 
comprises one or more repeat regions; wherein 

one or more nucleic acid units is (are) repeated in a repeat region; 
repeats of a nucleic acid unit have common nucleic acid sequences; and 
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

129. A method of producing a plant artificial chromosome, comprising:
introducing the vector of claim 81, 87 or 127 into a cell 

comprising one or more plant chromosomes; and 

selecting a cell comprising an artificial chromosome that 
comprises one or more repeat regions; wherein

one or more nucleic acid units is (are) repeated in a repeat region; 
repeats of a nucleic acid unit have common nucleic acid 

sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid.

130. The method of claim 123, wherein the cell into which the vector
is introduced is an animal cell. 

131. The method of claim 130, wherein the cell is a mammalian cell.

132. The method of claim 78, wherein the heterochromatic DNA is 
pericentric heterochromatin.

SEQUENCE LISTING

<110> CHROMOS MOLECULAR SYSTEMS, INC.
Perez, Carl 
Fabijanski, Steven 
Perkins, Edward 

<120> Plant Artificial Chromosomes, Uses thereof, and Methods of Preparing 
Plant Artificial Chromosomes 

<130> 24601-419PC 

<140> Not Yet Assigned 
<141> Herewith 

<150> US 60/294,687 
<151> 2001-05-30 

<150> US 60/296,329 
<151> 2001-06-04 

<160> 51 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 11182 
<212> DNA
<213> Artificial Sequence

<220>
<223> pAg1 plasmid

<400> 1

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60 
atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120 
agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180 
gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240 
agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300 
ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360 
ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420 
acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480 
ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540 
acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600 
agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660 
tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720 
tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780 
ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840 
gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900 
gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960 
cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020 
ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080
gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140
tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200
aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260
aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320
ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380
ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440
cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500 
atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560 
accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620 
gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680 
gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740
ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800 
cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860 
aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920 
gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980
agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040
ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100
atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040
tacgaattcg agccttgact agagggtcga cggtatacag acatgataag atacattgat 8100
gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 8160
gatgctattg ctttatttgt aaccattata agctgcaata aacaagttgg ggtgggcgaa 8220
gaactccagc atgagatccc cgcgctggag gatcatccag ccggcgtccc ggaaaacgat 8280
tccgaagccc aacctttcat agaaggcggc ggtggaatcg aaatctcgta gcacgtgtca 8340
gtcctgctcc tcggccacga agtgcacgca gttgccggcc gggtcgcgca gggcgaactc 8400
ccgcccccac ggctgctcgc cgatctcggt catggccggc ccggaggcgt cccggaagtt 8460
cgtggacacg acctccgacc actcggcgta cagctcgtcc aggccgcgca cccacaccca 8520
ggccagggtg ttgtccggca ccacctggtc ctggaccgcg ctgatgaaca gggtcacgtc 8580
gtcccggacc acaccggcga agtcgtcctc cacgaagtcc cgggagaacc cgagccggtc 8640
ggtccagaac tcgaccgctc cggcgacgtc gcgcgcggtg agcaccggaa cggcactggt 8700
caacttggcc atggatccag atttcgctca agttagtata aaaaagcagg cttcaatcct 8760
gcaggaattc gatcgacact ctcgtctact ccaagaatat caaagataca gtctcagaag 8820
accaaagggc tattgagact tttcaacaaa gggtaatatc gggaaacctc ctcggattcc 8880
attgcccagc tatctgtcac ttcatcaaaa ggacagtaga aaaggaaggt ggcacctaca 8940
aatgccatca ttgcgataaa ggaaaggcta tcgttcaaga tgcctctgcc gacagtggtc 9000
ccaaagatgg acccccaccc acgaggagca tcgtggaaaa agaagacgtt ccaaccacgt 9060
cttcaaagca agtggattga tgtgataaca tggtggagca cgacactctc gtctactcca 9120
agaatatcaa agatacagtc tcagaagacc aaagggctat tgagactttt caacaaaggg 9180
taatatcggg aaacctcctc ggattccatt gcccagctat ctgtcacttc atcaaaagga 9240
cagtagaaaa ggaaggtggc acctacaaat gccatcattg cgataaagga aaggctatcg 9300
ttcaagatgc ctctgccgac agtggtccca aagatggacc cccacccacg aggagcatcg 9360
tggaaaaaga agacgttcca accacgtctt caaagcaagt ggattgatgt gatatctcca 9420
ctgacgtaag ggatgacgca caatcccact atccttcgca agaccttcct ctatataagg 9480
aagttcattt catttggaga ggacacgctg aaatcaccag tctctctcta caaatctatc 9540
tctctcgagc tttcgcagat ccgggggggc aatgagatat gaaaaagcct gaactcaccg 9600
cgacgtctgt cgagaagttt ctgatcgaaa agttcgacag cgtctccgac ctgatgcagc 9660
tctcggaggg cgaagaatct cgtgctttca gcttcgatgt aggagggcgt ggatatgtcc 9720
tgcgggtaaa tagctgcgcc gatggtttct acaaagatcg ttatgtttat cggcactttg 9780
catcggccgc gctcccgatt ccggaagtgc ttgacattgg ggagtttagc gagagcctga 9840
cctattgcat ctcccgccgt gcacagggtg tcacgttgca agacctgcct gaaaccgaac 9900
tgcccgctgt tctacaaccg gtcgcggagg ctatggatgc gatcgctgcg gccgatctta 9960
gccagacgag cgggttcggc ccattcggac cgcaaggaat cggtcaatac actacatggc 10020
gtgatttcat atgcgcgatt gctgatcccc atgtgtatca ctggcaaact gtgatggacg 10080
acaccgtcag tgcgtccgtc gcgcaggctc tcgatgagct gatgctttgg gccgaggact 10140
gccccgaagt ccggcacctc gtgcacgcgg atttcggctc caacaatgtc ctgacggaca 10200
atggccgcat aacagcggtc attgactgga gcgaggcgat gttcggggat tcccaatacg 10260
aggtcgccaa catcttcttc tggaggccgt ggttggcttg tatggagcag cagacgcgct 10320
acttcgagcg gaggcatccg gagcttgcag gatcgccacg actccgggcg tatatgctcc 10380
gcattggtct tgaccaactc tatcagagct tggttgacgg caatttcgat gatgcagctt 10440
gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc gggcgtacac 10500
aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg tgtagaagta ctcgccgata 10560
gtggaaaccg acgccccagc actcgtccga gggcaaagaa atagagtaga tgccgaccgg 10620
atctgtcgat cgacaagctc gagtttctcc ataataatgt gtgagtagtt cccagataag 10680
ggaattaggg ttcctatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 10740
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa accaaaatcc 10800
agtactaaaa tccagatccc ccgaattaat tcggcgttaa ttcagatcaa gcttggcact 10860
ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc gttacccaac ttaatcgcct 10920
tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca ccgatcgccc 10980
ttcccaacag ttgcgcagcc tgaatggcga atgctagagc agcttgagct tggatcagat 11040
tgtcgtttcc cgccttcagt ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 11100
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 11160
tccgttcgtc catttgtatg tg 11182

<210> 2 
<211> 8428 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCambia3300 plasmid 
<400> 2 

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60 
atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120 
agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180 
gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240 
agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300 
ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360
ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420 
acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480 
ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540 
acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600 
agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660 
tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720 
tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780 
ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840 
gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900 
gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960 
cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020 
ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080 
gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140 
tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200
aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260
aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320 
ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380 
ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440 
cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500 
atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560 
accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620 
gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680 
gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740 
ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800 
cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860 
aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920 
gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980 
agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040 
ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100 
atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160 
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360 
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420 
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480 
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540 
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600 
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660 
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720 
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780 
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840 
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900 
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960 
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020 
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080 
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140 
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200 
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260 
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320 
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380 
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440 
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500 
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560 
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680 
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800 
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860 
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980 
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040 
tacgaattcg agctcggtac ccggggatcc tctagagtcg acctgcaggc atgcaagctt 8100 
ggcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa 8160 
tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg cccgcaccga 8220 
tcgcccttcc caacagttgc gcagcctgaa tggcgaatgc tagagcagct tgagcttgga 8280 
tcagattgtc gtttcccgcc ttcagtttaa actatcagtg tttgacagga tatattggcg 8340 
ggtaaaccta agagaaaaga gcgtttatta gaataacgga tatttaaaag ggcgtgaaaa 8400 
ggtttatccg ttcgtccatt tgtatgtg 8428 

<210> 3 
<211> 10549 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCambia1302 plasmid
<300> 

<308> Genbank #AF234298 
<309> 2000-04-24 

<400> 3 

catggtagat ctgactagta aaggagaaga acttttcact ggagttgtcc caattcttgt 60 
tgaattagat ggtgatgtta atgggcacaa attttctgtc agtggagagg gtgaaggtga 120
tgcaacatac ggaaaactta cccttaaatt tatttgcact actggaaaac tacctgttcc 180 
gtggccaaca cttgtcacta ctttctctta tggtgttcaa tgcttttcaa gatacccaga 240 
tcatatgaag cggcacgact tcttcaagag cgccatgcct gagggatacg tgcaggagag 300 
gaccatcttc ttcaaggacg acgggaacta caagacacgt gctgaagtca agtttgaggg 360 
agacaccctc gtcaacagga tcgagcttaa gggaatcgat ttcaaggagg acggaaacat 420 
cctcggccac aagttggaat acaactacaa ctcccacaac gtatacatca tggccgacaa 480
gcaaaagaac ggcatcaaag ccaacttcaa gacccgccac aacatcgaag acggcggcgt 540 
gcaactcgct gatcattatc aacaaaatac tccaattggc gatggccctg tccttttacc 600 
agacaaccat tacctgtcca cacaatctgc cctttcgaaa gatcccaacg aaaagagaga 660 
ccacatggtc cttcttgagt ttgtaacagc tgctgggatt acacatggca tggatgaact 720 
atacaaagct agccaccacc accaccacca cgtgtgaatt ggtgaccagc tcgaatttcc 780 
ccgatcgttc aaacatttgg caataaagtt tcttaagatt gaatcctgtt gccggtcttg 840 
cgatgattat catataattt ctgttgaatt acgttaagca tgtaataatt aacatgtaat 900 
gcatgacgtt atttatgaga tgggttttta tgattagagt cccgcaatta tacatttaat 960
acgcgataga aaacaaaata tagcgcgcaa actaggataa attatcgcgc gcggtgtcat 1020
ctatgttact agatcgggaa ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 1080 
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 1140 
tccgttcgtc catttgtatg tgcatgccaa ccacagggtt cccctcggga tcaaagtact 1200 
ttgatccaac ccctccgctg ctatagtgca gtcggcttct gacgttcagt gcagccgtct 1260 
tctgaaaacg acatgtcgca caagtcctaa gttacgcgac aggctgccgc cctgcccttt 1320 
tcctggcgtt ttcttgtcgc gtgttttagt cgcataaagt agaatacttg cgactagaac 1380
cggagacatt acgccatgaa caagagcgcc gccgctggcc tgctgggcta tgcccgcgtc 1440
agcaccgacg accaggactt gaccaaccaa cgggccgaac tgcacgcggc cggctgcacc 1500
aagctgtttt ccgagaagat caccggcacc aggcgcgacc gcccggagct ggccaggatg 1560 
cttgaccacc tacgccctgg cgacgttgtg acagtgacca ggctagaccg cctggcccgc 1620 
agcacccgcg acctactgga cattgccgag cgcatccagg aggccggcgc gggcctgcgt 1680 
agcctggcag agccgtgggc cgacaccacc acgccggccg gccgcatggt gttgaccgtg 1740
ttcgccggca ttgccgagtt cgagcgttcc ctaatcatcg accgcacccg gagcgggcgc 1800 
gaggccgcca aggcccgagg cgtgaagttt ggcccccgcc ctaccctcac cccggcacag 1860 
atcgcgcacg cccgcgagct gatcgaccag gaaggccgca ccgtgaaaga ggcggctgca 1920 
ctgcttggcg tgcatcgctc gaccctgtac cgcgcacttg agcgcagcga ggaagtgacg 1980 
cccaccgagg ccaggcggcg cggtgccttc cgtgaggacg cattgaccga ggccgacgcc 2040
ctggcggccg ccgagaatga acgccaagag gaacaagcat gaaaccgcac caggacggcc 2100 
aggacgaacc gtttttcatt accgaagaga tcgaggcgga gatgatcgcg gccgggtacg 2160 
tgttcgagcc gcccgcgcac gtctcaaccg tgcggctgca tgaaatcctg gccggtttgt 2220 
ctgatgccaa gctggcggcc tggccggcca gcttggccgc tgaagaaacc gagcgccgcc 2280 
gtctaaaaag gtgatgtgta tttgagtaaa acagcttgcg tcatgcggtc gctgcgtata 2340 
tgatgcgatg agtaaataaa caaatacgca aggggaacgc atgaaggtta tcgctgtact 2400 
taaccagaaa ggcgggtcag gcaagacgac catcgcaacc catctagccc gcgccctgca 2460 
actcgccggg gccgatgttc tgttagtcga ttccgatccc cagggcagtg cccgcgattg 2520 
ggcggccgtg cgggaagatc aaccgctaac cgttgtcggc atcgaccgcc cgacgattga 2580 
ccgcgacgtg aaggccatcg gccggcgcga cttcgtagtg atcgacggag cgccccaggc 2640
ggcggacttg gctgtgtccg cgatcaaggc agccgacttc gtgctgattc cggtgcagcc 2700 
aagcccttac gacatatggg ccaccgccga cctggtggag ctggttaagc agcgcattga 2760 
ggtcacggat ggaaggctac aagcggcctt tgtcgtgtcg cgggcgatca aaggcacgcg 2820 
catcggcggt gaggttgccg aggcgctggc cgggtacgag ctgcccattc ttgagtcccg 2880
tatcacgcag cgcgtgagct acccaggcac tgccgccgcc ggcacaaccg ttcttgaatc 2940
agaacccgag ggcgacgctg cccgcgaggt ccaggcgctg gccgctgaaa ttaaatcaaa 3000 
actcatttga gttaatgagg taaagagaaa atgagcaaaa gcacaaacac gctaagtgcc 3060 
ggccgtccga gcgcacgcag cagcaaggct gcaacgttgg ccagcctggc agacacgcca 3120 
gccatgaagc gggtcaactt tcagttgccg gcggaggatc acaccaagct gaagatgtac 3180 
gcggtacgcc aaggcaagac cattaccgag ctgctatctg aatacatcgc gcagctacca 3240 
gagtaaatga gcaaatgaat aaatgagtag atgaatttta gcggctaaag gaggcggcat 3300 
ggaaaatcaa gaacaaccag gcaccgacgc cgtggaatgc cccatgtgtg gaggaacggg 3360 
cggttggcca ggcgtaagcg gctgggttgt ctgccggccc tgcaatggca ctggaacccc 3420
caagcccgag gaatcggcgt gacggtcgca aaccatccgg cccggtacaa atcggcgcgg 3480 
cgctgggtga tgacctggtg gagaagttga aggccgcgca ggccgcccag cggcaacgca 3540 
tcgaggcaga agcacgcccc ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag 3600
aatcccggca accgccggca gccggtgcgc cgtcgattag gaagccgccc aagggcgacg 3660 
agcaaccaga ttttttcgtt ccgatgctct atgacgtggg cacccgcgat agtcgcagca 3720
tcatggacgt ggccgttttc cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc 3780
gctacgagct tccagacggg cacgtagagg tttccgcagg gccggccggc atggccagtg 3840
tgtgggatta cgacctggta ctgatggcgg tttcccatct aaccgaatcc atgaaccgat 3900 
accgggaagg gaagggagac aagcccggcc gcgtgttccg tccacacgtt gcggacgtac 3960
tcaagttctg ccggcgagcc gatggcggaa agcagaaaga cgacctggta gaaacctgca 4020
ttcggttaaa caccacgcac gttgccatgc agcgtacgaa gaaggccaag aacggccgcc 4080 
tggtgacggt atccgagggt gaagccttga ttagccgcta caagatcgta aagagcgaaa 4140 
ccgggcggcc ggagtacatc gagatcgagc tagctgattg gatgtaccgc gagatcacag 4200 
aaggcaagaa cccggacgtg ctgacggttc accccgatta ctttttgatc gatcccggca 4260 
tcggccgttt tctctaccgc ctggcacgcc gcgccgcagg caaggcagaa gccagatggt 4320 
tgttcaagac gatctacgaa cgcagtggca gcgccggaga gttcaagaag ttctgtttca 4380 
ccgtgcgcaa gctgatcggg tcaaatgacc tgccggagta cgatttgaag gaggaggcgg 4440 
ggcaggctgg cccgatccta gtcatgcgct accgcaacct gatcgagggc gaagcatccg 4500 
ccggttccta atgtacggag cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc 4560 
gaaaaggtct ctttcctgtg gatagcacgt acattgggaa cccaaagccg tacattggga 4620 
accggaaccc gtacattggg aacccaaagc cgtacattgg gaaccggtca cacatgtaag 4680 
tgactgatat aaaagagaaa aaaggcgatt tttccgccta aaactcttta aaacttatta 4740 
aaactcttaa aacccgcctg gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc 4800 
tgcaaaaagc gcctaccctt cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc 4860
ctatcgcggc cgctggccgc tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc 4920
gcggacaagc cgcgccgtcg ccactcgacc gccggcgccc acatcaaggc accctgcctc 4980
gcgcgtttcg
gcttgtctgt 
ggcgggtgtc 
ttaactatgc 
cgcacagatg 
actcgctgcg 
tacggttatc 
aaaaggccag 
ctgacgagca 
aaagatacca 
cgcttaccgg 
cacgctgtag 
aaccccccgt 
cggtaagaca 
ggtatgtagg 
ggacagtatt 
gctcttgatc 
agattacgcg 
acgctcagtg 
acaattcatc 
agtcaaaaaa 
agaaggcaat 
ttactttgcc 
gttcctcttc 
gagtgtcttc 
ccaattcggc 
agtgaaagag 
cttcatactc 
catcatgccg 
tcatgtcctt 
ttaaatatag 
ccgtatcttt 
ttttagccat 
taattataac 
gaaaacagct 
gattttgaaa 
taccctccgc 
agcatcggta 
cggactgatg 
tgttggctgg 
aataacacat 
tggattttag 
acaaatacaa 
ggaaccctaa 
gtcgatcgac 
gcgtcggttt 
tctgcgggcg 
tcgaccctgc 
gtcaagacca 
cctccgctcg 
gatgttggcg 
tgttatgcgg 
ccggacttcg 
cgcactgacg 
gcatatgaaa 
cccgctcgtc 
tagaacagcg 
ggagatgcaa 
gagcgcggcc 
gctatttacc 
ttcgccctcc 
ctcgacagac 
gaaagctcga 
aatgaaatga 
atcccttacg 
gtcttctttt 
agaggcatct 



gtgatgacgg 
aagcggatgc 
ggggcgcagc 
ggcatcagag 
cgtaaggaga 
ctcggtcgtt 
cacagaatca 
gaaccgtaaa 
tcacaaaaat 
ggcgtttccc 
atacctgtcc 
gtatctcagt 
tcagcccgac 
cgacttatcg 
cggtgctaca 
tggtatctgc 
cggcaaacaa 
cagaaaaaaa 
gaacgaaaac 
cagtaaaata 
tagctcgaca 
gtcataccac 
atctttcaca 
gggcttttcc 
ttcccagttt 
taagcggctg 
cctgatgcac 
ttccgagcaa 
ttcaaagtgc 
ttcccgttcc 
gttttcattt 
tacgcagcgg 
ttattatttc 
aagacgaact 
ttttcaaagt 
ccgcggtgat 
gagatcatcc 
acatgagcaa 
ggctgcctgt 
ctggtggcag 
tgcggacgtt 
tactggattt 
atacatacta 
ttcccttatc 
agatccggtc 
ccactatcgg 
atttgtgtac 
gcccaagctg 
atgcggagca 
aagtagcgcg 
acctcgtatt 
ccattgtccg 
gggcagtcct 
gtgtcgtcca 
tcacgccatg 
tggctaagat 
ggcagttcgg 
taggtcaggc 
gatgcaaagt 
cgcaggacat 
gagagctgca 
gtcgcggtga 
gagagataga 
acttccttat 
tcagtggaga 
tccacgatgc 
tgaacgatag 



tgaaaacctc 
cgggagcaga 
catgacccag 
cagattgtac 
aaataccgca 
cggctgcggc 
ggggataacg 
aaggccgcgt 
cgacgctcaa 
cctggaagct 
gcctttctcc 
tcggtgtagg 
cgctgcgcct 
ccactggcag 
gagttcttga 
gctctgctga 
accaccgctg 
ggatctcaag 
tcacgttaag 
taatatttta 
tactgttctt 
ttgtccgccc 
aagatgttgc 
gtctttaaaa 
tcgcaatcca 
tctaagctat 
tccgcataca 
aggacgccat 
aggacctttg 
acatcatagg 
tctcccacca 
tatttttcga 
cttcctcttt 
ccaattcact 
tgttttcaaa 
cacaggcagc 
gtgtttcaaa 
agtctgccgc 
atcgagtggt 
gatatattgt 
tttaatgtac 
tggttttagg 
agggtttctt 
tgggaactac 
ggcatctact 
cgagtacttc 
gcccgacagt 
catcatcgaa 
tatacgcccg 
tctgctgctc 
gggaatcccc 
tcaggacatt 
cggcccaaag 
tcacagtttg 
tagtgtattg 
cggccgcagc 
tttcaggcag 
tctcgctaaa 
gccgataaac 
atccacgccc 
tcaggtcgga 
gttcaggctt 
tttgtagaga 
atagaggaag 
tatcacatca 
tcctcgtggg 
cctttccttt 



tgacacatgc 
caagcccgtc 
tcacgtagcg 
tgagagtgca 
tcaggcgctc 
gagcggtatc 
caggaaagaa 
tgctggcgtt 
gtcagaggtg 
ccctcgtgcg 
cttcgggaag 
tcgttcgctc 
tatccggtaa 
cagccactgg 
agtggtggcc 
agccagttac 
gtagcggtgg 
aagatccttt 
ggattttggt 
ttttctccca 
ccccgatatc 
tgccgcttct 
tgtctcccag 
aatcatacag 
catcggccag 
tcgtataggg 
gctcgataat 
cggcctcact 
gaacaggcag 
tggtcccttt 
gcttatatac 
tcagtttttt 
tctacagtat 
gttccttgca 
gttggcgtat 
aacgctctgt 
cccggcagct 
cttacaacgg 
gattttgtgc 
ggtgtaaaca 
tgaattaacg 
aattagaaat 
atatgctcaa 
tcacacatta 
ctatttcttt 
tacacagcca 
cccggctccg 
attgccgtca 
gagtcgtggc 
catacaagcc 
gaacatcgcc 
gttggagccg 
catcagctca 
ccagtgatac 
accgattcct 
gatcgcatcc 
gtcttgcaac 
ctccccaatg 
ataacgatct 
tcctacatcg 
gacgctgtcg 
tttcatatct 
gagactggtg 
gtcttgcgaa 
atccacttgc 
tgggggtcca 
atcgcaatga 



agctcccgga 
agggcgcgtc 
atagcggagt 
ccatatgcgg 
ttccgcttcc 
agctcactca 
catgtgagca 
tttccatagg 
gcgaaacccg 
ctctcctgtt 
cgtggcgctt 
caagctgggc 
ctatcgtctt 
taacaggatt 
taactacggc 
cttcggaaaa 
tttttttgtt 
gatcttttct 
catgcattct 
atcaggcttg 
ctccctgatc 
cccaagatca 
gtcgccgtgg 
ctcgcgcgga 
atcgttattc 
acaatccgat 
cttttcaggg 
catgagcaga 
ctttccttcc 
ataccggctg 
cttagcagga 
caattccggt 
ttaaagatac 
ttctaaaacc 
aacatagtat 
catcgttaca 
tagttgccgt 
ctctcccgct 
cgagctgccg 
aattgacgct 
ccgaattaat 
tttattgata 
cacatgagcg 
ttatggagaa 
gccctcggac 
tcggtccaga 
gatcggacga 
accaagctct 
gatcctgcaa 
aaccacggcc 
tcgctccagt 
aaatccgcgt 
tcgagagcct 
acatggggat 
tgcggtccga 
atagcctccg 
gtgacaccct 
tcaagcactt 
ttgtagaaac 
aagctgaaag 
aacttttcga 
cattgccccc 
atttcagcgt 
ggatagtggg 
tttgaagacg 
tctttgggac 
tggcatttgt 



gacggtcaca 
agcgggtgtt 
gtatactggc 
tgtgaaatac 
tcgctcactg 
aaggcggtaa 
aaaggccagc 
ctccgccccc 
acaggactat 
ccgaccctgc 
tctcatagct 
tgtgtgcacg 
gagtccaacc 
agcagagcga 
tacactagaa 
agagttggta 
tgcaagcagc 
acggggtctg 
aggtactaaa 
atccccagta 
gaccggacgc 
ataaagccac 
gaaaagacaa 
tctttaaatg 
agtaagtaat 
atgtcgatgg 
ctttgttcat 
ttgctccagc 
agccatagca 
tccgtcattt 
gacattcctt 
gatattctca 
cccaagaagc 
ttaaatacca 
cgacggagcc 
atcaacatgc 
tcttccgaat 
gacgccgtcc 
gtcggggagc 
tagacaactt 
tcgggggatc 
gaagtatttt 
aaaccctata 
actcgagctt 
gagtgctggg 
cggccgcgct 
ttgcgtcgca 
gatagagttg 
gctccggatg 
tccagaagaa 
caatgaccgc 
gcacgaggtg 
gcgcgacgga 
cagcaatcgc 
atgggccgaa 
cgaccggttg 
gtgcacggcg 
ccggaatcgg 
catcggcgca 
cacgagattc 
tcagaaactt 
cgggatctgc 
gtcctctcca 
attgtgcgtc 
tggttggaac 
cactgtcggc 
aggtgccacc 



5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5B20 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
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ttccttttct actgtccttt tgatgaagtg acagatagct gggcaatgga atccgaggag 9060 
gtttcccgat attacccttt gttgaaaagt ctcaatagcc ctttggtctt ctgagactgt 9120 
atctttgata ttcttggagt agacgagagt gtcgtgctcc accatgttat cacatcaatc 9180 
cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc tcgtgggtgg 9240 
gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct ttcctttatc 9300 
gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga tgaagtgaca 9360 
gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt gaaaagtctc 9420 
aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga cgagagtgtc 9480 
gtgctccacc atgttggcaa gctgctctag ccaatacgca aaccgcctct ccccgcgcgt 9540 
tggccgattc attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag 9600 
cgcaacgcaa ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg 9660 
cttccggctc gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc 9720 
tatgaccatg attacgaatt cgagctcggt acccggggat cctctagagt cgacctgcag 9780 
gcatgcaagc ttggcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt 9840 
tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga 9900 
ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat gctagagcag 9960 
cttgagcttg gatcagattg tcgtttcccg ccttcagttt agcttcatgg agtcaaagat 10020
tcaaatagag gacctaacag aactcgccgt aaagactggc gaacagttca tacagagtct 10080 
cttacgactc aatgacaaga agaaaatctt cgtcaacatg gtggagcacg acacacttgt 10140 
ctactccaaa aatatcaaag atacagtctc agaagaccaa agggcaattg agacttttca 10200 
acaaagggta atatccggaa acctcctcgg attccattgc ccagctatct gtcactttat 10260 
tgtgaagata gtggaaaagg aaggtggctc ctacaaatgc catcattgcg ataaaggaaa 10320 
ggccatcgtt gaagatgcct ctgccgacag tggtcccaaa gatggacccc cacccacgag 10380 
gagcatcgtg gaaaaagaag acgttccaac cacgtcttca aagcaagtgg attgatgtga 10440 
tatctccact gacgtaaggg atgacgcaca atcccactat ccttcgcaag acccttcctc 10500
tatataagga agttcatttc atttggagag aacacggggg actcttgac 10549

<210> 4 
<211> 33 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> CaMV35SpolyA Primer 
<400> 4 

ctgaattaac gccgaattaa ttcgggggat ctg 33

<210> 5
<211> 29
<212> DNA

<213> Artificial Sequence
<220>

<223> CaMV35Spr Primer
<400> 5

ctagagcagc ttgccaacat ggtggagca 29

<210> 6 
<211> 12592 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAg2 Plasmid 
<400> 6 

gtacgaagaa ggccaagaac ggccgcctgg tgacggtatc cgagggtgaa gccttgatta 60 
gccgctacaa gatcgtaaag agcgaaaccg ggcggccgga gtacatcgag atcgagctag 120 
ctgattggat gtaccgcgag atcacagaag gcaagaaccc ggacgtgctg acggttcacc 180 
ccgattactt tttgatcgat cccggcatcg gccgttttct ctaccgcctg gcacgccgcg 240 
ccgcaggcaa ggcagaagcc agatggttgt tcaagacgat ctacgaacgc agtggcagcg 300 
ccggagagtt caagaagttc tgtttcaccg tgcgcaagct gatcgggtca aatgacctgc 360 
cggagtacga tttgaaggag gaggcggggc aggctggccc gatcctagtc atgcgctacc 420 
gcaacctgat cgagggcgaa gcatccgccg gttcctaatg tacggagcag atgctagggc 480
aaattgccct agcaggggaa aaaggtcgaa aaggtctctt tcctgtggat agcacgtaca 540
ttgggaaccc aaagccgtac attgggaacc ggaacccgta cattgggaac ccaaagccgt 600
acattgggaa ccggtcacac atgtaagtga ctgatataaa agagaaaaaa ggcgattttt 660
ccgcctaaaa ctctttaaaa cttattaaaa ctcttaaaac ccgcctggcc tgtgcataac 720
tgtctggcca gcgcacagcc gaagagctgc aaaaagcgcc tacccttcgg tcgctgcgct 780
ccctacgccc cgccgcttcg cgtcggccta tcgcggccgc tggccgctca aaaatggctg 840
gcctacggcc aggcaatcta ccagggcgcg gacaagccgc gccgtcgcca ctcgaccgcc 900
ggcgcccaca tcaaggcacc ctgcctcgcg cgtttcggtg atgacggtga aaacctctga 960
cacatgcagc tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa 1020
gcccgtcagg gcgcgtcagc gggtgttggc gggtgtcggg gcgcagccat gacccagtca 1080
cgtagcgata gcggagtgta tactggctta actatgcggc atcagagcag attgtactga 1140
gagtgcacca tatgcggtgt gaaataccgc acagatgcgt aaggagaaaa taccgcatca 1200
ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 1260
cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag 1320
gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 1380
tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc 1440
agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 1500
tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 1560
cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 1620
ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 1680
ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 1740
ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 1800
ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct ctgctgaagc 1860
cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 1920
gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 1980
atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 2040
ttttggtcat gcattctagg tactaaaaca attcatccag taaaatataa tattttattt 2100
tctcccaatc aggcttgatc cccagtaagt caaaaaatag ctcgacatac tgttcttccc 2160
cgatatcctc cctgatcgac cggacgcaga aggcaatgtc ataccacttg tccgccctgc 2220
cgcttctccc aagatcaata aagccactta ctttgccatc tttcacaaag atgttgctgt 2280
ctcccaggtc gccgtgggaa aagacaagtt cctcttcggg cttttccgtc tttaaaaaat 2340
catacagctc gcgcggatct ttaaatggag tgtcttcttc ccagttttcg caatccacat 2400
cggccagatc gttattcagt aagtaatcca attcggctaa gcggctgtct aagctattcg 2460
tatagggaca atccgatatg tcgatggagt gaaagagcct gatgcactcc gcatacagct 2520
cgataatctt ttcagggctt tgttcatctt catactcttc cgagcaaagg acgccatcgg 2580
cctcactcat gagcagattg ctccagccat catgccgttc aaagtgcagg acctttggaa 2640
caggcagctt tccttccagc catagcatca tgtccttttc ccgttccaca tcataggtgg 2700
tccctttata ccggctgtcc gtcattttta aatataggtt ttcattttct cccaccagct 2760
tatatacctt agcaggagac attccttccg tatcttttac gcagcggtat ttttcgatca 2820
gttttttcaa ttccggtgat attctcattt tagccattta ttatttcctt cctcttttct 2880
acagtattta aagatacccc aagaagctaa ttataacaag acgaactcca attcactgtt 2940
ccttgcattc taaaacctta aataccagaa aacagctttt tcaaagttgt tttcaaagtt 3000
ggcgtataac atagtatcga cggagccgat tttgaaaccg cggtgatcac aggcagcaac 3060
gctctgtcat cgttacaatc aacatgctac cctccgcgag atcatccgtg tttcaaaccc 3120
ggcagcttag ttgccgttct tccgaatagc atcggtaaca tgagcaaagt ctgccgcctt 3180
acaacggctc tcccgctgac gccgtcccgg actgatgggc tgcctgtatc gagtggtgat 3240
tttgtgccga gctgccggtc ggggagctgt tggctggctg gtggcaggat atattgtggt 3300
gtaaacaaat tgacgcttag acaacttaat aacacattgc ggacgttttt aatgtactga 3360
attaacgccg aattaattcg ggggatctgg attttagtac tggattttgg ttttaggaat 3420
tagaaatttt attgatagaa gtattttaca aatacaaata catactaagg gtttcttata 3480
tgctcaacac atgagcgaaa ccctatagga accctaattc ccttatctgg gaactactca 3540
cacattatta tggagaaact cgagtcaaat ctcggtgacg ggcaggaccg gacggggcgg 3600
taccggcagg ctgaagtcca gctgccagaa acccacgtca tgccagttcc cgtgcttgaa 3660
gccggccgcc cgcagcatgc cgcggggggc atatccgagc gcctcgtgca tgcgcacgct 3720
cgggtcgttg ggcagcccga tgacagcgac cacgctcttg aagccctgtg cctccaggga 3780
cttcagcagg tgggtgtaga gcgtggagcc cagtcccgtc cgctggtggc ggggggagac 3840
gtacacggtc gactcggccg tccagtcgta ggcgttgcgt gccttccagg ggcccgcgta 3900
ggcgatgccg gcgacctcgc cgtccacctc ggcgacgagc cagggatagc gctcccgcag 3960
acggacgagg tcgtccgtcc actcctgcgg ttcctgcggc tcggtacgga agttgaccgt 4020
gcttgtctcg atgtagtggt tgacgatggt gcagaccgcc ggcatgtccg cctcggtggc 4080
acggcggatg tcggccgggc gtcgttctgg gctcatggta gactcgagag agatagattt 4140
gtagagagag actggtgatt tcagcgtgtc ctctccaaat gaaatgaact tccttatata 4200
gaggaaggtc ttgcgaagga tagtgggatt gtgcgtcatc ccttacgtca gtggagatat 4260
cacatcaatc cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc 4320
tcgtgggtgg gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct 4380
ttcctttatc gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga 4440
tgaagtgaca gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt 4500
gaaaagtctc aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga 4560
cgagagtgtc gtgctccacc atgttatcac atcaatccac ttgctttgaa gacgtggttg 4620
gaacgtcttc tttttccacg atgctcctcg tgggtggggg tccatctttg ggaccactgt 4680 
cggcagaggc atcttgaacg atagcctttc ctttatcgca atgatggcat ttgtaggtgc 4740
caccttcctt ttctactgtc cttttgatga agtgacagat agctgggcaa tggaatccga 4800 
ggaggtttcc cgatattacc ctttgttgaa aagtctcaat agccctttgg tcttctgaga 4860 
ctgtatcttt gatattcttg gagtagacga gagtgtcgtg ctccaccatg ttggcaagct 4920 
gctctagcca atacgcaaac cgcctctccc cgcgcgttgg ccgattcatt aatgcagctg 4980 
gcacgacagg tttcccgact ggaaagcggg cagtgagcgc aacgcaatta atgtgagtta 5040 
gctcactcat taggcacccc aggctttaca ctttatgctt ccggctcgta tgttgtgtgg 5100 
aattgtgagc ggataacaat ttcacacagg aaacagctat gaccatgatt acgaattcga 5160 
gccttgacta gagggtcgac ggtatacaga catgataaga tacattgatg agtttggaca 5220 
aaccacaact agaatgcagt gaaaaaaatg ctttatttgt gaaatttgtg atgctattgc 5280 
tttatttgta accattataa gctgcaataa acaagttggg gtgggcgaag aactccagca 5340 
tgagatcccc gcgctggagg atcatccagc cggcgtcccg gaaaacgatt ccgaagccca 5400 
acctttcata gaaggcggcg gtggaatcga aatctcgtag cacgtgtcag tcctgctcct 5460 
cggccacgaa gtgcacgcag ttgccggccg ggtcgcgcag ggcgaactcc cgcccccacg 5520 
gctgctcgcc gatctcggtc atggccggcc cggaggcgtc ccggaagttc gtggacacga 5580 
cctccgacca ctcggcgtac agctcgtcca ggccgcgcac ccacacccag gccagggtgt 5640
tgtccggcac cacctggtcc tggaccgcgc tgatgaacag ggtcacgtcg tcccggacca 5700 
caccggcgaa gtcgtcctcc acgaagtccc gggagaaccc gagccggtcg gtccagaact 5760 
cgaccgctcc ggcgacgtcg cgcgcggtga gcaccggaac ggcactggtc aacttggcca 5820 
tggatccaga tttcgctcaa gttagtataa aaaagcaggc ttcaatcctg caggaattcg 5880 
atcgacactc tcgtctactc caagaatatc aaagatacag tctcagaaga ccaaagggct 5940 
attgagactt ttcaacaaag ggtaatatcg ggaaacctcc tcggattcca ttgcccagct 6000 
atctgtcact tcatcaaaag gacagtagaa aaggaaggtg gcacctacaa atgccatcat 6060 
tgcgataaag gaaaggctat cgttcaagat gcctctgccg acagtggtcc caaagatgga 6120 
cccccaccca cgaggagcat cgtggaaaaa gaagacgttc caaccacgtc ttcaaagcaa 6180 
gtggattgat gtgataacat ggtggagcac gacactctcg tctactccaa gaatatcaaa 6240 
gatacagtct cagaagacca aagggctatt gagacttttc aacaaagggt aatatcggga 6300 
aacctcctcg gattccattg cccagctatc tgtcacttca tcaaaaggac agtagaaaag 6360
gaaggtggca cctacaaatg ccatcattgc gataaaggaa aggctatcgt tcaagatgcc 6420 
tctgccgaca gtggtcccaa agatggaccc ccacccacga ggagcatcgt ggaaaaagaa 6480 
gacgttccaa ccacgtcttc aaagcaagtg gattgatgtg atatctccac tgacgtaagg 6540 
gatgacgcac aatcccacta tccttcgcaa gaccttcctc tatataagga agttcatttc 6600 
atttggagag gacacgctga aatcaccagt ctctctctac aaatctatct ctctcgagct 6660 
ttcgcagatc cgggggggca atgagatatg aaaaagcctg aactcaccgc gacgtctgtc 6720 
gagaagtttc tgatcgaaaa gttcgacagc gtctccgacc tgatgcagct ctcggagggc 6780 
gaagaatctc gtgctttcag cttcgatgta ggagggcgtg gatatgtcct gcgggtaaat 6840
agctgcgccg atggtttcta caaagatcgt tatgtttatc ggcactttgc atcggccgcg 6900 
ctcccgattc cggaagtgct tgacattggg gagtttagcg agagcctgac ctattgcatc 6960 
tcccgccgtg cacagggtgt cacgttgcaa gacctgcctg aaaccgaact gcccgctgtt 7020 
ctacaaccgg tcgcggaggc tatggatgcg atcgctgcgg ccgatcttag ccagacgagc 7080 
gggttcggcc cattcggacc gcaaggaatc ggtcaataca ctacatggcg tgatttcata 7140 
tgcgcgattg ctgatcccca tgtgtatcac tggcaaactg tgatggacga caccgtcagt 7200
gcgtccgtcg cgcaggctct cgatgagctg atgctttggg ccgaggactg ccccgaagtc 7260 
cggcacctcg tgcacgcgga tttcggctcc aacaatgtcc tgacggacaa tggccgcata 7320 
acagcggtca ttgactggag cgaggcgatg ttcggggatt cccaatacga ggtcgccaac 7380
atcttcttct ggaggccgtg gttggcttgt atggagcagc agacgcgcta cttcgagcgg 7440 
aggcatccgg agcttgcagg atcgccacga ctccgggcgt atatgctccg cattggtctt 7500 
gaccaactct atcagagctt ggttgacggc aatttcgatg atgcagcttg ggcgcagggt 7560 
cgatgcgacg caatcgtccg atccggagcc gggactgtcg ggcgtacaca aatcgcccgc 7620 
agaagcgcgg ccgtctggac cgatggctgt gtagaagtac tcgccgatag tggaaaccga 7680 
cgccccagca ctcgtccgag ggcaaagaaa tagagtagat gccgaccgga tctgtcgatc 7740
gacaagctcg agtttctcca taataatgtg tgagtagttc ccagataagg gaattagggt 7800 
tcctataggg tttcgctcat gtgttgagca tataagaaac ccttagtatg tatttgtatt 7860 
tgtaaaatac ttctatcaat aaaatttcta attcctaaaa ccaaaatcca gtactaaaat 7920 
ccagatcccc cgaattaatt cggcgttaat tcagatcaag cttggcactg gccgtcgttt 7980 
tacaacgtcg tgactgggaa aaccctggcg ttacccaact taatcgcctt gcagcacatc 8040 
cccctttcgc cagctggcgt aatagcgaag aggcccgcac cgatcgccct tcccaacagt 8100 
tgcgcagcct gaatggcgaa tgctagagca gcttgagctt ggatcagatt gtcgtttccc 8160 
gccttcagtt tggggatcct ctagactgaa ggcgggaaac gacaatctga tcatgagcgg 8220 
agaattaagg gagtcacgtt atgacccccg ccgatgacgc gggacaagcc gttttacgtt 8280 
tggaactgac agaaccgcaa cgttgaagga gccactcagc cgcgggtttc tggagtttaa 8340 
tgagctaagc acatacgtca gaaaccatta ttgcgcgttc aaaagtcgcc taaggtcact 8400 
atcagctagc aaatatttct tgtcaaaaat gctccactga cgttccataa attcccctcg 8460 
gtatccaatt agagtctcat attcactctc aatccaaata atctgcaccg gatctcgaga 8520 
atcgaattcc cgcggccgcc atggtagatc tgactagtaa aggagaagaa cttttcactg 8580
gagttgtccc aattcttgtt gaattagatg gtgatgttaa tgggcacaaa ttttctgtca 8640
gtggagaggg tgaaggtgat gcaacatacg gaaaacttac ccttaaattt atttgcacta 8700
ctggaaaact acctgttccg tggccaacac ttgtcactac tttctcttat ggtgttcaat 8760
gcttttcaag atacccagat catatgaagc ggcacgactt cttcaagagc gccatgcctg 8820
agggatacgt gcaggagagg accatcttct tcaaggacga cgggaactac aagacacgtg 8880
ctgaagtcaa gtttgaggga gacaccctcg tcaacaggat cgagcttaag ggaatcgatt 8940
tcaaggagga cggaaacatc ctcggccaca agttggaata caactacaac tcccacaacg 9000
tatacatcat ggccgacaag caaaagaacg gcatcaaagc caacttcaag acccgccaca 9060
acatcgaaga cggcggcgtg caactcgctg atcattatca acaaaatact ccaattggcg 9120
atggccctgt ccttttacca gacaaccatt acctgtccac acaatctgcc ctttcgaaag 9180
atcccaacga aaagagagac cacatggtcc ttcttgagtt tgtaacagct gctgggatta 9240
cacatggcat ggatgaacta tacaaagcta gccaccacca ccaccaccac gtgtgaattg 9300
gtgaccagct cgaatttccc cgatcgttca aacatttggc aataaagttt cttaagattg 9360
aatcctgttg ccggtcttgc gatgattatc atataatttc tgttgaatta cgttaagcat 9420
gtaataatta acatgtaatg catgacgtta tttatgagat gggtttttat gattagagtc 9480
ccgcaattat acatttaata cgcgatagaa aacaaaatat agcgcgcaaa ctaggataaa 9540
ttatcgcgcg cggtgtcatc tatgttacta gatcgggaat taaactatca gtgtttgaca 9600
ggatatattg gcgggtaaac ctaagagaaa agagcgttta ttagaataac ggatatttaa 9660
aagggcgtga aaaggtttat ccgttcgtcc atttgtatgt gcatgccaac cacagggttc 9720
ccctcgggat caaagtactt tgatccaacc cctccgctgc tatagtgcag tcggcttctg 9780
acgttcagtg cagccgtctt ctgaaaacga catgtcgcac aagtcctaag ttacgcgaca 9840
ggctgccgcc ctgccctttt cctggcgttt tcttgtcgcg tgttttagtc gcataaagta 9900
gaatacttgc gactagaacc ggagacatta cgccatgaac aagagcgccg ccgctggcct 9960
gctgggctat gcccgcgtca gcaccgacga ccaggacttg accaaccaac gggccgaact 10020
gcacgcggcc ggctgcacca agctgttttc cgagaagatc accggcacca ggcgcgaccg 10080
cccggagctg gccaggatgc ttgaccacct acgccctggc gacgttgtga cagtgaccag 10140
gctagaccgc ctggcccgca gcacccgcga cctactggac attgccgagc gcatccagga 10200
ggccggcgcg ggcctgcgta gcctggcaga gccgtgggcc gacaccacca cgccggccgg 10260
ccgcatggtg ttgaccgtgt tcgccggcat tgccgagttc gagcgttccc taatcatcga 10320
ccgcacccgg agcgggcgcg aggccgccaa ggcccgaggc gtgaagtttg gcccccgccc 10380
taccctcacc ccggcacaga tcgcgcacgc ccgcgagctg atcgaccagg aaggccgcac 10440
cgtgaaagag gcggctgcac tgcttggcgt gcatcgctcg accctgtacc gcgcacttga 10500
gcgcagcgag gaagtgacgc ccaccgaggc caggcggcgc ggtgccttcc gtgaggacgc 10560
attgaccgag gccgacgccc tggcggccgc cgagaatgaa cgccaagagg aacaagcatg 10620
aaaccgcacc aggacggcca ggacgaaccg tttttcatta ccgaagagat cgaggcggag 10680
atgatcgcgg ccgggtacgt gttcgagccg cccgcgcacg tctcaaccgt gcggctgcat 10740
gaaatcctgg ccggtttgtc tgatgccaag ctggcggcct ggccggccag cttggccgct 10800
gaagaaaccg agcgccgccg tctaaaaagg tgatgtgtat ttgagtaaaa cagcttgcgt 10860
catgcggtcg ctgcgtatat gatgcgatga gtaaataaac aaatacgcaa ggggaacgca 10920
tgaaggttat cgctgtactt aaccagaaag gcgggtcagg caagacgacc atcgcaaccc 10980
atctagcccg cgccctgcaa ctcgccgggg ccgatgttct gttagtcgat tccgatcccc 11040
agggcagtgc ccgcgattgg gcggccgtgc gggaagatca accgctaacc gttgtcggca 11100
tcgaccgccc gacgattgac cgcgacgtga aggccatcgg ccggcgcgac ttcgtagtga 11160
tcgacggagc gccccaggcg gcggacttgg ctgtgtccgc gatcaaggca gccgacttcg 11220
tgctgattcc ggtgcagcca agcccttacg acatatgggc caccgccgac ctggtggagc 11280
tggttaagca gcgcattgag gtcacggatg gaaggctaca agcggccttt gtcgtgtcgc 11340
gggcgatcaa aggcacgcgc atcggcggtg aggttgccga ggcgctggcc gggtacgagc 11400
tgcccattct tgagtcccgt atcacgcagc gcgtgagcta cccaggcact gccgccgccg 11460
gcacaaccgt tcttgaatca gaacccgagg gcgacgctgc ccgcgaggtc caggcgctgg 11520
ccgctgaaat taaatcaaaa ctcatttgag ttaatgaggt aaagagaaaa tgagcaaaag 11580
cacaaacacg ctaagtgccg gccgtccgag cgcacgcagc agcaaggctg caacgttggc 11640
cagcctggca gacacgccag ccatgaagcg ggtcaacttt cagttgccgg cggaggatca 11700
caccaagctg aagatgtacg cggtacgcca aggcaagacc attaccgagc tgctatctga 11760
atacatcgcg cagctaccag agtaaatgag caaatgaata aatgagtaga tgaattttag 11820
cggctaaagg aggcggcatg gaaaatcaag aacaaccagg caccgacgcc gtggaatgcc 11880
ccatgtgtgg aggaacgggc ggttggccag gcgtaagcgg ctgggttgtc tgccggccct 11940
gcaatggcac tggaaccccc aagcccgagg aatcggcgtg acggtcgcaa accatccggc 12000
ccggtacaaa tcggcgcggc gctgggtgat gacctggtgg agaagttgaa ggccgcgcag 12060
gccgcccagc ggcaacgcat cgaggcagaa gcacgccccg gtgaatcgtg gcaagcggcc 12120
gctgatcgaa tccgcaaaga atcccggcaa ccgccggcag ccggtgcgcc gtcgattagg 12180
aagccgccca agggcgacga gcaaccagat tttttcgttc cgatgctcta tgacgtgggc 12240
acccgcgata gtcgcagcat catggacgtg gccgttttcc gtctgtcgaa gcgtgaccga 12300
cgagctggcg aggtgatccg ctacgagctt ccagacgggc acgtagaggt ttccgcaggg 12360
ccggccggca tggccagtgt gtgggattac gacctggtac tgatggcggt ttcccatcta 12420
accgaatcca tgaaccgata ccgggaaggg aagggagaca agcccggccg cgtgttccgt 12480
ccacacgttg cggacgtact caagttctgc cggcgagccg atggcggaaa gcagaaagac 12540
gacctggtag aaacctgcat tcggttaaac accacgcacg ttgccatgca gc 12592

<210> 7

<211> 3357 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pGEMEasyNOS Plasmid 



<400> 7 

tatcactagt gaattcgcgg ccgcctgcag gtcgaccata tgggagagct cccaacgcgt 60
tggatgcata gcttgagtat tctatagtgt cacctaaata gcttggcgta atcatggtca 120
tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga 180
agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg 240
cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc 300
caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac 360
tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata 420
cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 480
aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct 540
gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa 600
agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 660
cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca 720
cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 780
ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 840
gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 900
tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga 960
acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc 1020
tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag 1080
attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac 1140
gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc 1200
ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag 1260
taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt 1320
ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag 1380
ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca 1440
gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact 1500
ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca 1560
gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg 1620
tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc 1680
atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg 1740
gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca 1800
tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt 1860
atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc 1920
agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 1980
ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca 2040
tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa 2100
aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat 2160
tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa 2220
aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga tgcggtgtga 2280
aataccgcac agatgcgtaa ggagaaaata ccgcatcagg aaattgtaag cgttaatatt 2340
ttgttaaaat tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa 2400
atcggcaaaa tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca 2460
gtttggaaca agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc 2520
gtctatcagg gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg 2580
aggtgccgta aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg 2640
ggaaagccgg cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg 2700
gcgctggcaa gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg 2760
ccgctacagg gcgcgtccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg 2820
tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg atgtgctgca aggcgattaa 2880
gttgggtaac gccagggttt tcccagtcac gacgttgtaa aacgacggcc agtgaattgt 2940
aatacgactc actatagggc gaattgggcc cgacgtcgca tgctcccggc cgccatggcg 3000
gccgcgggaa ttcgattctc gagatccggt gcagattatt tggattgaga gtgaatatga 3060
gactctaatt ggataccgag gggaatttat ggaacgtcag tggagcattt ttgacaagaa 3120
atatttgcta gctgatagtg accttaggcg acttttgaac gcgcaataat ggtttctgac 3180
gtatgtgctt agctcattaa actccagaaa cccgcggctg agtggctcct tcaacgttgc 3240
ggttctgtca gttccaaacg taaaacggct tgtcccgcgt catcggcggg ggtcataacg 3300
tgactccctt aattctccgc tcatgatcag attgtcgttt cccgccttca gtctaga 3357

<210> 8
<211> 10122
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pl302NOS Plasmid 
<400> 8 

catggtagat ctgactagta aaggagaaga acttttcact ggagttgtcc caattcttgt 60 
tgaattagat ggtgatgtta atgggcacaa attttctgtc agtggagagg gtgaaggtga 120 
tgcaacatac ggaaaactta cccttaaatt tatttgcact actggaaaac tacctgttcc 180 
gtggccaaca cttgtcacta ctttctctta tggtgttcaa tgcttttcaa gatacccaga 240 
tcatatgaag cggcacgact tcttcaagag cgccatgcct gagggatacg tgcaggagag 300
gaccatcttc ttcaaggacg acgggaacta caagacacgt gctgaagtca agtttgaggg 360 
agacaccctc gtcaacagga tcgagcttaa gggaatcgat ttcaaggagg acggaaacat 420 
cctcggccac aagttggaat acaactacaa ctcccacaac gtatacatca tggccgacaa 480 
gcaaaagaac ggcatcaaag ccaacttcaa gacccgccac aacatcgaag acggcggcgt 540 
gcaactcgct gatcattatc aacaaaatac tccaattggc gatggccctg tccttttacc 600 
agacaaccat tacctgtcca cacaatctgc cctttcgaaa gatcccaacg aaaagagaga 660 
ccacatggtc cttcttgagt ttgtaacagc tgctgggatt acacatggca tggatgaact 720 
atacaaagct agccaccacc accaccacca cgtgtgaatt ggtgaccagc tcgaatttcc 780 
ccgatcgttc aaacatttgg caataaagtt tcttaagatt gaatcctgtt gccggtcttg 840 
cgatgattat catataattt ctgttgaatt acgttaagca tgtaataatt aacatgtaat 900 
gcatgacgtt atttatgaga tgggttttta tgattagagt cccgcaatta tacatttaat 960 
acgcgataga aaacaaaata tagcgcgcaa actaggataa attatcgcgc gcggtgtcat 1020 
ctatgttact agatcgggaa ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 1080 
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 1140 
tccgttcgtc catttgtatg tgcatgccaa ccacagggtt cccctcggga tcaaagtact 1200 
ttgatccaac ccctccgctg ctatagtgca gtcggcttct gacgttcagt gcagccgtct 1260
tctgaaaacg acatgtcgca caagtcctaa gttacgcgac aggctgccgc cctgcccttt 1320 
tcctggcgtt ttcttgtcgc gtgttttagt cgcataaagt agaatacttg cgactagaac 1380
cggagacatt acgccatgaa caagagcgcc gccgctggcc tgctgggcta tgcccgcgtc 1440
agcaccgacg accaggactt gaccaaccaa cgggccgaac tgcacgcggc cggctgcacc 1500 
aagctgtttt ccgagaagat caccggcacc aggcgcgacc gcccggagct ggccaggatg 1560 
cttgaccacc tacgccctgg cgacgttgtg acagtgacca ggctagaccg cctggcccgc 1620 
agcacccgcg acctactgga cattgccgag cgcatccagg aggccggcgc gggcctgcgt 1680 
agcctggcag agccgtgggc cgacaccacc acgccggccg gccgcatggt gttgaccgtg 1740 
ttcgccggca ttgccgagtt cgagcgttcc ctaatcatcg accgcacccg gagcgggcgc 1800 
gaggccgcca aggcccgagg cgtgaagttt ggcccccgcc ctaccctcac cccggcacag 1860 
atcgcgcacg cccgcgagct gatcgaccag gaaggccgca ccgtgaaaga ggcggctgca 1920 
ctgcttggcg tgcatcgctc gaccctgtac cgcgcacttg agcgcagcga ggaagtgacg 1980 
cccaccgagg ccaggcggcg cggtgccttc cgtgaggacg cattgaccga ggccgacgcc 2040 
ctggcggccg ccgagaatga acgccaagag gaacaagcat gaaaccgcac caggacggcc 2100 
aggacgaacc gtttttcatt accgaagaga tcgaggcgga gatgatcgcg gccgggtacg 2160 
tgttcgagcc gcccgcgcac gtctcaaccg tgcggctgca tgaaatcctg gccggtttgt 2220 
ctgatgccaa gctggcggcc tggccggcca gcttggccgc tgaagaaacc gagcgccgcc 2280
gtctaaaaag gtgatgtgta tttgagtaaa acagcttgcg tcatgcggtc gctgcgtata 2340 
tgatgcgatg agtaaataaa caaatacgca aggggaacgc atgaaggtta tcgctgtact 2400 
taaccagaaa ggcgggtcag gcaagacgac catcgcaacc catctagccc gcgccctgca 2460 
actcgccggg gccgatgttc tgttagtcga ttccgatccc cagggcagtg cccgcgattg 2520
ggcggccgtg cgggaagatc aaccgctaac cgttgtcggc atcgaccgcc cgacgattga 2580 
ccgcgacgtg aaggccatcg gccggcgcga cttcgtagtg atcgacggag cgccccaggc 2640 
ggcggacttg gctgtgtccg cgatcaaggc agccgacttc gtgctgattc cggtgcagcc 2700 
aagcccttac gacatatggg ccaccgccga cctggtggag ctggttaagc agcgcattga 2760 
ggtcacggat ggaaggctac aagcggcctt tgtcgtgtcg cgggcgatca aaggcacgcg 2820 
catcggcggt gaggttgccg aggcgctggc cgggtacgag ctgcccattc ttgagtcccg 2880 
tatcacgcag cgcgtgagct acccaggcac tgccgccgcc ggcacaaccg ttcttgaatc 2940 
agaacccgag ggcgacgctg cccgcgaggt ccaggcgctg gccgctgaaa ttaaatcaaa 3000
actcatttga gttaatgagg taaagagaaa atgagcaaaa gcacaaacac gctaagtgcc 3060 
ggccgtccga gcgcacgcag cagcaaggct gcaacgttgg ccagcctggc agacacgcca 3120 
gccatgaagc gggtcaactt tcagttgccg gcggaggatc acaccaagct gaagatgtac 3180 
gcggtacgcc aaggcaagac cattaccgag ctgctatctg aatacatcgc gcagctacca 3240 
gagtaaatga gcaaatgaat aaatgagtag atgaatttta gcggctaaag gaggcggcat 3300 
ggaaaatcaa gaacaaccag gcaccgacgc cgtggaatgc cccatgtgtg gaggaacggg 3360 
cggttggcca ggcgtaagcg gctgggttgt ctgccggccc tgcaatggca ctggaacccc 3420 
caagcccgag gaatcggcgt gacggtcgca aaccatccgg cccggtacaa atcggcgcgg 3480 
cgctgggtga tgacctggtg gagaagttga aggccgcgca ggccgcccag cggcaacgca 3540
tcgaggcaga agcacgcccc ggtgaatcgt
aatcccggca accgccggca gccggtgcgc 
agcaaccaga ttttttcgtt ccgatgctct 
tcatggacgt ggccgttttc cgtctgtcga 
gctacgagct tccagacggg cacgtagagg 
tgtgggatta cgacctggta ctgatggcgg 
accgggaagg gaagggagac aagcccggcc 
tcaagttctg ccggcgagcc gatggcggaa 
ttcggttaaa caccacgcac gttgccatgc 
tggtgacggt atccgagggt gaagccttga 
ccgggcggcc ggagtacatc gagatcgagc 
aaggcaagaa cccggacgtg ctgacggttc 
tcggccgttt tctctaccgc ctggcacgcc 
tgttcaagac gatctacgaa cgcagtggca 
ccgtgcgcaa gctgatcggg tcaaatgacc 
ggcaggctgg cccgatccta gtcatgcgct 
ccggttccta atgtacggag cagatgctag 
gaaaaggtct ctttcctgtg gatagcacgt 
accggaaccc gtacattggg aacccaaagc 
tgactgatat aaaagagaaa aaaggcgatt 
aaactcttaa aacccgcctg gcctgtgcat 
tgcaaaaagc gcctaccctt cggtcgctgc 
ctatcgcggc cgctggccgc tcaaaaatgg 
gcggacaagc cgcgccgtcg ccactcgacc 
gcgcgtttcg gtgatgacgg tgaaaacctc 
gcttgtctgt aagcggatgc cgggagcaga 
ggcgggtgtc ggggcgcagc catgacccag 
ttaactatgc ggcatcagag cagattgtac 
cgcacagatg cgtaaggaga aaataccgca 
actcgctgcg ctcggtcgtt cggctgcggc 
tacggttatc cacagaatca ggggataacg 
aaaaggccag gaaccgtaaa aaggccgcgt 
ctgacgagca tcacaaaaat cgacgctcaa 
aaagatacca ggcgtttccc cctggaagct 
cgcttaccgg atacctgtcc gcctttctcc 
cacgctgtag gtatctcagt tcggtgtagg 
aaccccccgt tcagcccgac cgctgcgcct 
cggtaagaca cgacttatcg ccactggcag 
ggtatgtagg cggtgctaca gagttcttga 
ggacagtatt tggtatctgc gctctgctga 
gctcttgatc cggcaaacaa accaccgctg 
agattacgcg cagaaaaaaa ggatctcaag 
acgctcagtg gaacgaaaac tcacgttaag 
acaattcatc cagtaaaata taatatttta 
agtcaaaaaa tagctcgaca tactgttctt 
agaaggcaat gtcataccac ttgtccgccc 
ttactttgcc atctttcaca aagatgttgc 
gttcctcttc gggcttttcc gtctttaaaa 
gagtgtcttc ttcccagttt tcgcaatcca 
ccaattcggc taagcggctg tctaagctat 
agtgaaagag cctgatgcac tccgcataca 
cttcatactc ttccgagcaa aggacgccat 
catcatgccg ttcaaagtgc aggacctttg 
tcatgtcctt ttcccgttcc acatcatagg 
ttaaatatag gttttcattt tctcccacca 
ccgtatcttt tacgcagcgg tatttttcga 
ttttagccat ttattatttc cttcctcttt 
taattataac aagacgaact ccaattcact 
gaaaacagct ttttcaaagt tgttttcaaa 
gattttgaaa ccgcggtgat cacaggcagc 
taccctccgc gagatcatcc gtgtttcaaa 
agcatcggta acatgagcaa agtctgccgc 
cggactgatg ggctgcctgt atcgagtggt 
tgttggctgg ctggtggcag gatatattgt 
aataacacat tgcggacgtt tttaatgtac 
tggattttag tactggattt tggttttagg 
acaaatacaa atacatacta agggtttctt 



-15- 

ggcaagcggc cgctgatcga atccgcaaag 3600 
cgtcgattag gaagccgccc aagggcgacg 3660 
atgacgtggg cacccgcgat agtcgcagca 3720 
agcgtgaccg acgagctggc gaggtgatcc 3780 
tttccgcagg gccggccggc atggccagtg 3840 
tttcccatct aaccgaatcc atgaaccgat 3900 
gcgtgttccg tccacacgtt gcggacgtac 3 960 
agcagaaaga cgacctggta gaaacctgca 4020 
agcgtacgaa gaaggccaag aacggccgcc 4080 
ttagccgcta caagatcgta aagagcgaaa 4140 
tagctgattg gatgtaccgc gagatcacag 4200 
accccgatta ctttttgatc gatcccggca 4260 
gcgccgcagg caaggcagaa gccagatggt 4320 
gcgccggaga gttcaagaag ttctgtttca 4380 
tgccggagta cgatttgaag gaggaggcgg 4440 
accgcaacct gatcgagggc gaagcatccg 4500 
ggcaaattgc cctagcaggg gaaaaaggtc 4560 
acattgggaa cccaaagccg tacattggga 4620 
cgtacattgg gaaccggtca cacatgtaag 4680 
tttccgccta aaactcttta aaacttatta 4740 
aactgtctgg ccagcgcaca gccgaagagc 4 800 
gctccctacg ccccgccgct tcgcgtcggc 4 860 
ctggcctacg gccaggcaat ctaccagggc 4 920 
gccggcgccc acatcaaggc accctgcctc 4 980 
tgacacatgc agctcccgga gacggtcaca 5040 
caagcccgtc agggcgcgtc agcgggtgtt 5100 
tcacgtagcg atagcggagt gtatactggc 5160 
tgagagtgca ccatatgcgg tgtgaaatac 5220 
tcaggcgctc ttccgcttcc tcgctcactg 5280 
gagcggtatc agctcactca aaggcggtaa 5340 
caggaaagaa catgtgagca aaaggccagc 5400 
tgctggcgtt tttccatagg ctccgccccc 5460 
gtcagaggtg gcgaaacccg acaggactat 5520 
ccctcgtgcg ctctcctgtt ccgaccctgc 5580 
cttcgggaag cgtggcgctt tctcatagct 564 0 
tcgttcgctc caagctgggc tgtgtgcacg 5700 
tatccggtaa ctatcgtctt gagtccaacc 5760 
cagccactgg taacaggatt agcagagcga 5820 
agtggtggcc taactacggc tacactagaa 5880 
agccagttac cttcggaaaa agagttggta 5940 
gtagcggtgg tttttttgtt tgcaagcagc 6000 
aagatccttt gatcttttct acggggtctg 6060 
ggattttggt catgcattct aggtactaaa 6120 
ttttctccca atcaggcttg atccccagta 6180 
ccccgatatc ctccctgatc gaccggacgc 6240 
tgccgcttct cccaagatca ataaagccac 6300 
tgtctcccag gtcgccgtgg gaaaagacaa 6360 
aatcatacag ctcgcgcgga tctttaaatg 6420 
catcggccag atcgttattc agtaagtaat 6480 
tcgtataggg acaatccgat atgtcgatgg 6540 
gctcgataat cttttcaggg ctttgttcat 6600 
cggcctcact catgagcaga ttgctccagc 6660 
gaacaggcag ctttccttcc agccatagca 6720 
tggtcccttt ataccggctg tccgtcattt 6780 
gcttatatac cttagcagga gacattcctt 6840 
tcagtttttt caattccggt gatattctca 6900 
tctacagtat ttaaagatac cccaagaagc 6960 
gttccttgca ttctaaaacc ttaaatacca 7020 
gttggcgtat aacatagtat cgacggagcc 7080 
aacgctctgt catcgttaca atcaacatgc 7140 
cccggcagct tagttgccgt tcttccgaat 7200 
cttacaacgg ctctcccgct gacgccgtcc 72 60 
gattttgtgc cgagctgccg gtcggggagc 7320 
ggtgtaaaca aattgacgct tagacaactt 7380 
tgaattaacg ccgaattaat tcgggggatc 7440 
aattagaaat tttattgata gaagtatttt 7500 
atatgctcaa cacatgagcg aaaccctata 7560 
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ggaaccctaa 
gtcgatcgac 
gcgtcggttt 
tctgcgggcg 
tcgaccctgc 
gtcaagacca 
cctccgctcg 
gatgttggcg 
tgttatgcgg 
ccggacttcg 
cgcactgacg 
gcatatgaaa 
cccgctcgtc 
tagaacagcg 
ggagatgcaa 
gagcgcggcc 
gctatttacc 
ttcgccctcc 
ctcgacagac 
gaaagctcga 
aatgaaatga 
atcccttacg 
gtcttctttt 
agaggcatct 
ttccttttct 
gtttcccgat 
atctttgata 
cacttgcttt 
gggtccatct 
gcaatgatgg 
gatagctggg 
aatagccctt 
gtgctccacc 
tggccgattc 
cgcaacgcaa 
cttccggctc 
tatgaccatg 
aacgacaatc 
cgcgggacaa 
agccgcgggt 
ttcaaaagtc 
tgacgttcca 
ataatctgca 



ttcccttatc 
agatccggtc 
ccactatcgg 
atttgtgtac 
gcccaagctg 
atgcggagca 
aagtagcgcg 
acctcgtatt 
ccattgtccg 
gggcagtcct 
gtgtcgtcca 
tcacgccatg 
tggctaagat 
ggcagttcgg 
taggtcaggc 
gatgcaaagt 
cgcaggacat 
gagagctgca 
gtcgcggtga 
gagagataga 
acttccttat 
tcagtggaga 
tccacgatgc 
tgaacgatag 
actgtccttt 
attacccttt 
ttcttggagt 
gaagacgtgg 
ttgggaccac 
catttgtagg 
caatggaatc 
tggtcttctg 
atgttggcaa 
attaatgcag 
ttaatgtgag 
gtatgttgtg 
attacgaatt 
tgatcatgag 
gccgttttac 
ttctggagtt 
gcctaaggtc 
taaattcccc 
ccggatctcg 



tgggaactac 
ggcatctact 
cgagtacttc 
gcccgacagt 
catcatcgaa 
tatacgcccg 
tctgctgctc 
gggaatcccc 
tcaggacatt 
cggcccaaag 
tcacagtttg 
tagtgtattg 
cggccgcagc 
tttcaggcag 
tctcgctaaa 
gccgataaac 
atccacgccc 
tcaggtcgga 
gttcaggctt 
tttgtagaga 
atagaggaag 
tatcacatca 
tcctcgtggg 
cctttccttt 
tgatgaagtg 
gttgaaaagt 
agacgagagt 
ttggaacgtc 
tgtcggcaga 
tgccaccttc 
cgaggaggtt 
agactgtatc 
gctgctctag 
ctggcacgac 
ttagctcact 
tggaattgtg 
cgagctcggt 
cggagaatta 
gtttggaact 
taatgagcta 
actatcagct 
tcggtatcca 
agaatcgaat 



tcacacatta 
ctatttcttt 
tacacagcca 
cccggctccg 
attgccgtca 
gagtcgtggc 
catacaagcc 
gaacatcgcc 
gttggagccg 
catcagctca 
ccagtgatac 
accgattcct 
gatcgcatcc 
gtcttgcaac 
ctccccaatg 
ataacgatct 
tcctacatcg 
gacgctgtcg 
tttcatatct 
gagactggtg 
gtcttgcgaa 
atccacttgc 
tgggggtcca 
atcgcaatga 
acagatagct 
ctcaatagcc 
gtcgtgctcc 
ttctttttcc 
ggcatcttga 
cttttctact 
tcccgatatt 
tttgatattc 
ccaatacgca 
aggtttcccg 
cattaggcac 
agcggataac 
acccggggat 
agggagtcac 
gacagaaccg 
agcacatacg 
agcaaatatt 
attagagtct 
tcccgcggcc 



ttatggagaa 
gccctcggac 
tcggtccaga 
gatcggacga 
accaagctct 
gatcctgcaa 
aaccacggcc 
tcgctccagt 
aaatccgcgt 
tcgagagcct 
acatggggat 
tgcggtccga 
atagcctccg 
gtgacaccct 
tcaagcactt 
ttgtagaaac 
aagctgaaag 
aacttttcga 
cattgccccc 
atttcagcgt 
ggatagtggg 
tttgaagacg 
tctttgggac 
tggcatttgt 
gggcaatgga 
ctttggtctt 
accatgttat 
acgatgctcc 
acgatagcct 
gtccttttga 
accctttgtt 
ttggagtaga 
aaccgcctct 
actggaaagc 
cccaggcttt 
aatttcacac 
cctctagact 
gttatgaccc 
caacgttgaa 
tcagaaacca 
tcttgtcaaa 
catattcact 
gc 



actcgagctt 
gagtgctggg 
cggccgcgct 
ttgcgtcgca 
gatagagttg 
gctccggatg 
tccagaagaa 
caatgaccgc 
gcacgaggtg 
gcgcgacgga 
cagcaatcgc 
atgggccgaa 
cgaccggttg 
gtgcacggcg 
ccggaatcgg 
catcggcgca 
cacgagattc 
tcagaaactt 
ccggatctgc 
gtcctctcca 
attgtgcgtc 
tggttggaac 
cactgtcggc 
aggtgccacc 
atccgaggag 
ctgagactgt 
cacatcaatc 
tcgtgggtgg 
ttcctttatc 
tgaagtgaca 
gaaaagtctc 
cgagagtgtc 
ccccgcgcgt 
gggcagtgag 
acactttatg 
aggaaacagc 
gaaggcggga 
ccgccgatga 
ggagccactc 
ttattgcgcg 
aatgctccac 
ctcaatccaa 



7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10122 



<210> 9 

<211> 621 

<212> DNA 

<213> Artificial Sequence 



<220>
<223> N. tabacum rDNA intergenic spacer (IGS) sequence



<300> 

<308> Genbank #Y08422 
<309> 1997-10-31 



<400> 9 

gtgctagcca atgtttaaca agatgtcaag cacaatgaat gttggtggtt ggtggtcgtg 60
gctggcggtg gtggaaaatt gcggtggttc gagcggtagt gatcggcgat ggttggtgtt 120
tgcagcggtg tttgatatcg gaatcactta tggtggttgt cacaatggag gtgcgtcatg 180
gttattggtg gttggtcatc tatatatttt tataataata ttaagtattt tacctatttt 240
ttacatattt tttattaaat ttatgcattg tttgtatttt taaatagttt ttatcgtact 300
tgttttataa aatattttat tattttatgt gttatattat tacttgatgt attggaaatt 360
ttctccattg ttttttctat atttataata attttcttat ttttttttgt tttattatgt 420
attttttcgt tttataataa atatttatta aaaaaaatat tatttttgta aaatatatca 480
tttacaatgt ttaaaagtca tttgtgaata tattagctaa gttgtacttc tttttgtgca 540
tttggtgttg tacatgtcta ttatgattct ctggccaaaa catgtctact cctgtcactt 600
gggttttttt ttttaagaca t 621

<210> 10 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer NTIGS-F1 
<400> 10 

gtgctagcca atgtttaaca agatg 25 

<210> 11 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer NTIGS-R1
<400> 11 

atgtcttaaa aaaaaaaacc caagtgac 28 
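
The relationship between the two NTIGS primers (SEQ ID NOS: 10 and 11) and the tobacco IGS sequence (SEQ ID NO: 9) can be checked mechanically. The following is only a minimal sketch in Python: the helper names are hypothetical, and the sequence fragments are copied from this listing rather than from any external database. It assumes the forward primer is expected to match the 5' end of SEQ ID NO: 9 and the reverse primer to be the reverse complement of its 3' end.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    pairs = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(pairs[base] for base in reversed(seq))


def strip_blocks(listing_row: str) -> str:
    """Remove the spaces that separate the 10-nt blocks of a listing row."""
    return listing_row.replace(" ", "")


# Ends of SEQ ID NO: 9 (positions 1-30 and 591-621), copied from the listing.
IGS_START = strip_blocks("gtgctagcca atgtttaaca agatgtcaag")
IGS_END = strip_blocks("cctgtcactt gggttttttt ttttaagaca t")

# Primer sequences as given for SEQ ID NO: 10 and SEQ ID NO: 11.
FORWARD_PRIMER = strip_blocks("gtgctagcca atgtttaaca agatg")
REVERSE_PRIMER = strip_blocks("atgtcttaaa aaaaaaaacc caagtgac")

# The forward primer should read straight off the 5' end of the IGS,
# and the reverse primer should anneal to (be the reverse complement of) the 3' end.
forward_matches = IGS_START.startswith(FORWARD_PRIMER)
reverse_matches = IGS_END.endswith(reverse_complement(REVERSE_PRIMER))

if __name__ == "__main__":
    print("forward primer matches 5' end:", forward_matches)
    print("reverse primer matches 3' end:", reverse_matches)
```

The same two helpers could be pointed at any other primer/template pair in the listing, provided the fragments are copied exactly as printed.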

<210> 12 
<211> 233 
<212> DNA 

<213> Mus musculus 
<300> 

<308> Genbank #V00846
<309> 1989-07-06 

<400> 12 

gacctggaat atggcgagaa aactgaaaat cacggaaaat gagaaataca cactttagga 60 

cgtgaaatat ggcgaggaaa actgaaaaag gtggaaaatt tagaaatgtc cactgtagga 120 

cgtggaatat ggcaagaaaa ctgaaaatca tggaaaatga gaaacatcca cttgacgact 180 

tgaaaaatga cgaaatcact aaaaaacgtg aaaaatgaga aatgcacact gaa 233 

<210> 13 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer MSAT-F1 
<400> 13 

aataccgcgg aagcttgacc tggaatatcg c 31 

<210> 14 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer MSAT-R1
<400> 14 

ataaccgcgg agtccttcag tgtgcat 27 

<210> 15 
<211> 277 
<212> DNA 

<213> Artificial Sequence 



<220>
<223> Nopaline Synthase Promoter Fragment
<300> 

<308> Genbank #U09365

<309> 1997-10-17 



<400> 15

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60 
tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120 
aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180 
attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240 
gcgcgcggtg tcatctatgt tactagatcg ggaattc 277 

<210> 16 
<211> 1812 
<212> DNA 

<213> Escherichia coli 

<220> 

<221> CDS 

<222> (1)...(1812)

<223> Beta-glucuronidase 

<300> 

<308> Genbank #S69414 
<309> 1994-09-23 

<400> 16 

atg tta cgt cct gta gaa acc cca acc cgt gaa atc aaa aaa ctc gac 48
Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp
1 5 10 15

ggc ctg tgg gca ttc agt ctg gat cgc gaa aac tgt gga att gat cag 96
Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln
20 25 30

cgt tgg tgg gaa agc gcg tta caa gaa agc cgg gca att gct gtg cca 144
Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro
35 40 45

ggc agt ttt aac gat cag ttc gcc gat gca gat att cgt aat tat gcg 192
Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60

ggc aac gtc tgg tat cag cgc gaa gtc ttt ata ccg aaa ggt tgg gca 240
Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala
65 70 75 80

ggc cag cgt atc gtg ctg cgt ttc gat gcg gtc act cat tac ggc aaa 288
Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95

gtg tgg gtc aat aat cag gaa gtg atg gag cat cag ggc ggc tat acg 336
Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110

cca ttt gaa gcc gat gtc acg ccg tat gtt att gcc ggg aaa agt gta 384
Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val
115 120 125

cgt atc acc gtt tgt gtg aac aac gaa ctg aac tgg cag act atc ccg 432
Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro
130 135 140

ccg gga atg gtg att acc gac gaa aac ggc aag aaa aag cag tct tac 480
Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr
145 150 155 160

ttc cat gat ttc ttt aac tat gcc gga atc cat cgc agc gta atg ctc 528
Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175

tac acc acg ccg aac acc tgg gtg gac gat atc acc gtg gtg acg cat 576
Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190

gtc gcg caa gac tgt aac cac gcg tct gtt gac tgg cag gtg gtg gcc 624
Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
195 200 205

aat ggt gat gtc agc gtt gaa ctg cgt gat gcg gat caa cag gtg gtt 672
Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
210 215 220

gca act gga caa ggc act agc ggg act ttg caa gtg gtg aat ccg cac 720
Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
225 230 235 240

ctc tgg caa ccg ggt gaa ggt tat ctc tat gaa ctg tgc gtc aca gcc 768
Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255

aaa agc cag aca gag tgt gat atc tac ccg ctt cgc gtc ggc atc cgg 816
Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270

tca gtg gca gtg aag ggc gaa cag ttc ctg att aac cac aaa ccg ttc 864
Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
275 280 285

tac ttt act ggc ttt ggt cgt cat gaa gat gcg gac ttg cgt ggc aaa 912
Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
290 295 300

gga ttc gat aac gtg ctg atg gtg cac gac cac gca tta atg gac tgg 960
Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
305 310 315 320

att ggg gcc aac tcc tac cgt acc tcg cat tac cct tac gct gaa gag 1008
Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335

atg ctc gac tgg gca gat gaa cat ggc atc gtg gtg att gat gaa act 1056
Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
340 345 350

gct gct gtc ggc ttt aac ctc tct tta ggc att ggt ttc gaa gcg ggc 1104
Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
355 360 365

aac aag ccg aaa gaa ctg tac agc gaa gag gca gtc aac ggg gaa act 1152
Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
370 375 380

cag caa gcg cac tta cag gcg att aaa gag ctg ata gcg cgt gac aaa 1200
Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
385 390 395 400

aac cac cca agc gtg gtg atg tgg agt att gcc aac gaa ccg gat acc 1248
Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
405 410 415

cgt ccg caa ggt gca cgg gaa tat ttc gcg cca ctg gcg gaa gca acg 1296
Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr
420 425 430

cgt aaa ctc gac ccg acg cgt ccg atc acc tgc gtc aat gta atg ttc 1344
Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
435 440 445

tgc gac gct cac acc gat acc atc agc gat ctc ttt gat gtg ctg tgc 1392
Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys
450 455 460

ctg aac cgt tat tac gga tgg tat gtc caa agc ggc gat ttg gaa acg 1440
Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr
465 470 475 480

gca gag aag gta ctg gaa aaa gaa ctt ctg gcc tgg cag gag aaa ctg 1488
Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495

cat cag ccg att atc atc acc gaa tac ggc gtg gat acg tta gcc ggg 1536
His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly
500 505 510

ctg cac tca atg tac acc gac atg tgg agt gaa gag tat cag tgt gca 1584
Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala
515 520 525

tgg ctg gat atg tat cac cgc gtc ttt gat cgc gtc agc gcc gtc gtc 1632
Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val
530 535 540

ggt gaa cag gta tgg aat ttc gcc gat ttt gcg acc tcg caa ggc ata 1680
Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile
545 550 555 560

ttg cgc gtt ggc ggt aac aag aaa ggg atc ttc act cgc gac cgc aaa 1728
Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575

ccg aag tcg gcg gct ttt ctg ctg caa aaa cgc tgg act ggc atg aac 1776
Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn
580 585 590

ttc ggt gaa aaa ccg cag cag gga ggc aaa caa tga 1812
Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln *
595 600

<210> 17 
<211> 603 
<212> PRT 

<213> Escherichia coli 
<300> 

<308> Genbank #S69414 
<309> 1994-09-23 

<400> 17 

Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp
1 5 10 15

Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln
20 25 30

Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro
35 40 45

Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60

Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala
65 70 75 80

Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95

Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110

Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val
115 120 125

Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro
130 135 140

Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr
145 150 155 160

Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175

Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190

Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
195 200 205

Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
210 215 220

Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
225 230 235 240

Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255

Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270

Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
275 280 285

Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
290 295 300

Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
305 310 315 320

Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335

Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
340 345 350

Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
355 360 365

Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
370 375 380

Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
385 390 395 400

Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
405 410 415

Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr
420 425 430

Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
435 440 445

Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys
450 455 460

Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr
465 470 475 480

Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495

His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly
500 505 510

Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala
515 520 525

Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val
530 535 540

Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile
545 550 555 560

Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575

Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn
580 585 590

Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln
595 600

<210> 18
<211> 277 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Nopaline Synthase Terminator Sequence 
<300> 

<308> Genbank #U09365 
<309> 1995-10-17 

<400> 18 

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60 
tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120 
aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180 
attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240 
gcgcgcggtg tcatctatgt tactagatcg ggaattc 277 

<210> 19 
<211> 3438 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pLIT38attBZeo Plasmid



<400> 19 

tcgaccctct agtcaaggcc ttaagtgagt cgtattacgg actggccgtc gttttacaac 60
gtcgtgactg ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca catccccctt 120
tcgccagctg gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa cagttgcgca 180
gcctgaatgg cgaatggcgc ttcgcttggt aataaagccc gcttcggcgg gctttttttt 240
gttaactacg tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt 300
tttctaaata cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca 360
ataatattga aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt 420
ttttgcggca ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga 480
tgctgaagat cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa 540
gatccttgag agttttcgcc ccgaagaacg ttctccaatg atgagcactt ttaaagttct 600
gctatgtggc gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat 660
acactattct cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga 720
tggcatgaca gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc 780
caacttactt ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat 840
gggggatcat gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa 900
cgacgagcgt gacaccacga tgcctgtagc aatggcaaca acgttgcgca aactattaac 960
tggcgaacta cttactctag cttcccggca acaattaata gactggatgg aggcggataa 1020
agttgcagga ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc 1080
tggagccggt gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc 1140
ctcccgtatc gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag 1200
acagatcgct gagataggtg cctcactgat taagcattgg taactgtcag accaagttta 1260
ctcatatata ctttagattg atttaccccg gttgataatc agaaaagccc caaaaacagg 1320
aagattgtat aagcaaatat ttaaattgta aacgttaata ttttgttaaa attcgcgtta 1380
aatttttgtt aaatcagctc attttttaac caataggccg aaatcggcaa aatcccttat 1440
aaatcaaaag aatagcccga gatagggttg agtgttgttc cagtttggaa caagagtcca 1500
ctattaaaga acgtggactc caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc 1560
ccactacgtg aaccatcacc caaatcaagt tttttggggt cgaggtgccg taaagcacta 1620
aatcggaacc ctaaagggag cccccgattt agagcttgac ggggaaagcg aacgtggcga 1680
gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt gtagcggtca 1740
cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc gcgtaaaagg 1800
atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg 1860
ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt 1920
ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg 1980
ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata 2040
ccaaatactg ttcttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca 2100
ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag 2160
tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc 2220
tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga 2280
tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg 2340
tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac 2400
gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg 2460
tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg 2520
ttcctggcct tttgctggcc ttttgctcac atgtaatgtg agttagctca ctcattaggc 2580
accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata 2640
acaatttcac acaggaaaca gctatgacca tgattacgcc aagctacgta atacgactca 2700
ctagtggggc ccgtgcaatt gaagccggct ggcgccaagc ttctctgcag gattgaagcc 2760
tgctttttta tactaacttg agcgaaatct ggatccatgg ccaagttgac cagtgccgtt 2820
ccggtgctca ccgcgcgcga cgtcgccgga gcggtcgagt tctggaccga ccggctcggg 2880
ttctcccggg acttcgtgga ggacgacttc gccggtgtgg tccgggacga cgtgaccctg 2940
ttcatcagcg cggtccagga ccaggtggtg ccggacaaca ccctggcctg ggtgtgggtg 3000
cgcggcctgg acgagctgta cgccgagtgg tcggaggtcg tgtccacgaa cttccgggac 3060
gcctccgggc cggccatgac cgagatcggc gagcagccgt gggggcggga gttcgccctg 3120
cgcgacccgg ccggcaactg cgtgcacttc gtggccgagg agcaggactg acacgtgcta 3180
cgagatttcg attccaccgc cgccttctat gaaaggttgg gcttcggaat cgttttccgg 3240
gacgccggct ggatgatcct ccagcgcggg gatctcatgc tggagttctt cgcccacccc 3300
aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 3360
aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 3420
tatcatgtct gtataccg 3438



<210> 20 
<211> 3451 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> HindIII Fragment containing the beta-glucuronidase
coding sequence, the rDNA intergenic spacer, and
the Mastl sequence



<400> 20 

aagcttgacc tggaatatcg cgagtaaact gaaaatcacg gaaaatgaga aatacacact 60
ttaggacgtg aaatatggcg aggaaaactg aaaaaggtgg aaaatttaga aatgtccact 120
gtaggacgtg gaatatggca agaaaactga aaatcatgga aaatgagaaa catccacttg 180
acgacttgaa aaatgacgaa atcactaaaa aacgtgaaaa atgagaaatg cacactgaag 240
gactccgcgg gaattcgatt gtgctagcca atgtttaaca agatgtcaag cacaatgaat 300
gttggtggtt ggtggtcgtg gctggcggtg gtggaaaatt gcggtggttc gagcggtagt 360
gatcggcgat ggttggtgtt tgcagcggtg tttgatatcg gaatcactta tggtggttgt 420
cacaatggag gtgcgtcatg gttattggtg gttggtcatc tatatatttt tataataata 480
ttaagtattt tacctatttt ttacatattt tttattaaat ttatgcattg tttgtatttt 540
taaatagttt ttatcgtact tgttttataa aatattttat tattttatgt gttatattat 600
tacttgatgt attggaaatt ttctccattg ttttttctat atttataata attttcttat 660
ttttttttgt tttattatgt attttttcgt tttataataa atatttatta aaaaaaatat 720
tatttttgta aaatatatca tttacaatgt ttaaaagtca tttgtgaata tattagctaa 780
gttgtacttc tttttgtgca tttggtgttg tacatgtcta ttatgattct ctggccaaaa 840
catgtctact cctgtcactt gggttttttt ttttaagaca taatcactag tgattatatc 900
tagactgaag gcgggaaacg acaatctgat catgagcgga gaattaaggg agtcacgtta 960
tgacccccgc cgatgacgcg ggacaagccg ttttacgttt ggaactgaca gaaccgcaac 1020
gttgaaggag ccactcagcc gcgggtttct ggagtttaat gagctaagca catacgtcag 1080
aaaccattat tgcgcgttca aaagtcgcct aaggtcacta tcagctagca aatatttctt 1140
gtcaaaaatg ctccactgac gttccataaa ttcccctcgg tatccaatta gagtctcata 1200
ttcactctca atccaaataa tctgcaccgg atctcgagat cgaattcccg cggccgcgaa 1260
ttcactagtg gatccccggg tacggtcagt cccttatgtt acgtcctgta gaaaccccaa 1320
cccgtgaaat caaaaaactc gacggcctgt gggcattcag tctggatcgc gaaaactgtg 1380
gaattgagca gcgttggtgg gaaagcgcgt tacaagaaag ccgggcaatt gctgtgccag 1440
gcagttttaa cgatcagttc gccgatgcag atattcgtaa ttatgtgggc aacgtctggt 1500
atcagcgcga agtctttata ccgaaaggtt gggcaggcca gcgtatcgtg ctgcgtttcg 1560
atgcggtcac tcattacggc aaagtgtggg tcaataatca ggaagtgatg gagcatcagg 1620
gcggctatac gccatttgaa gccgatgtca cgccgtatgt tattgccggg aaaagtgtac 1680
gtatcacagt ttgtgtgaac aacgaactga actggcagac tatcccgccg ggaatggtga 1740
ttaccgacga aaacggcaag aaaaagcagt cttacttcca tgatttcttt aactacgccg 1800
ggatccatcg cagcgtaatg ctctacacca cgccgaacac ctgggtggac gatatcaccg 1860
tggtgacgca tgtcgcgcaa gactgtaacc acgcgtctgt tgactggcag gtggtggcca 1920
atggtgatgt cagcgttgaa ctgcgtgatg cggatcaaca ggtggttgca actggacaag 1980
gcaccagcgg gactttgcaa gtggtgaatc cgcacctctg gcaaccgggt gaaggttatc 2040
tctatgaact gtacgtcaca gccaaaagcc agacagagtg tgatatctac ccgctgcgcg 2100
tcggcatccg gtcagtggca gtgaagggcg aacagttcct gatcaaccac aaaccgttct 2160
actttactgg ctttggccgt catgaagatg cggatttgcg cggcaaagga ttcgataacg 2220
tgctgatggt gcacgatcac gcattaatgg actggattgg ggccaactcc taccgtacct 2280
cgcattaccc ttacgctgaa gagatgctcg actgggcaga tgaacatggc atcgtggtga 2340
ttgatgaaac tgcagctgtc ggctttaacc tctctttagg cattggtttc gaagcgggca 2400
acaagccgaa agaactgtac agcgaagagg cagtcaacgg ggaaactcag caggcgcact 2460
tacaggcgat taaagagctg atagcgcgtg acaaaaacca cccaagcgtg gtgatgtgga 2520
gtattgccaa cgaaccggat acccgtccgc aaggtgcacg ggaatatttc gcgccactgg 2580
cggaagcaac gcgtaaactc gatccgacgc gtccgatcac ctgcgtcaat gtaatgttct 2640
gcgacgctca caccgatacc atcagcgatc tctttgatgt gctgtgcctg aaccgttatt 2700
acggttggta tgtccaaagc ggcgatttgg aaacggcaga gaaggtactg gaaaaagaac 2760
ttctggcctg gcaggagaaa ctgcatcagc cgattatcat caccgaatac ggcgtggata 2820
cgttagccgg gctgcactca atgtacaccg acatgtggag tgaagagtat cagtgtgcat 2880
ggctggatat gtatcaccgc gtctttgatc gcgtcagcgc cgtcgtcggt gaacaggtat 2940
ggaatttcgc cgattttgcg acctcgcaag gcatattgcg cgttggcggt aacaagaagg 3000
ggatcttcac ccgcgaccgc aaaccgaagt cggcggcttt tctgctgcaa aaacgctgga 3060
ctggcatgaa cttcggtgaa aaaccgcagc agggaggcaa acaatgaatc aacaactctc 3120
ctggcgcacc atcgtcggct acagcctcgg gaattgcgta ccgagctcga atttccccga 3180
tcgttcaaac atttggcaat aaagtttctt aagattgaat cctgttgccg gtcttgcgat 3240
gattatcata taatttctgt tgaattacgt taagcatgta ataattaaca tgtaatgcat 3300
gacgttattt atgagatggg tttttatgat tagagtcccg caattataca tttaatacgc 3360
gatagaaaac aaaatatagc gcgcaaacta ggataaatta tcgcgcgcgg tgtcatctat 3420
gttactagat cgggaattcg atatcaagct t 3451

<210> 21 
<211> 14627 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> pAglla Plasmid 



<400> 21 

catgccaacc 

atagtgcagt 

agtcctaagt 

gttttagtcg 

agagcgccgc 

ccaaccaacg 

ccggcaccag 

acgttgtgac 

ttgccgagcg 

acaccaccac 

agcgttccct 

tgaagtttgg 

tcgaccagga 

ccctgtaccg 

gtgccttccg 

gccaagagga 

cgaagagatc 

ctcaaccgtg 

gccggccagc 

tgagtaaaac 

aatacgcaag 

aagacgacca 

ttagtcgatt 

ccgctaaccg 

cggcgcgact 

atcaaggcag 

accgccgacc 

gcggcctttg 

gcgctggccg 

ccaggcactg 

cgcgaggtcc 

aagagaaaat 

gcaaggctgc 

agttgccggc 

ttaccgagct 



acagggttcc 
cggcttctga 
tacgcgacag 
cataaagtag 
cgctggcctg 
ggccgaactg 
gcgcgaccgc 
agtgaccagg 
catccaggag 
gccggccggc 
aatcatcgac 
cccccgccct 
aggccgcacc 
cgcacttgag 
tgaggacgca 
acaagcatga 
gaggcggaga 
cggctgcatg 
ttggccgctg 
agcttgcgtc 
gggaacgcat 
tcgcaaccca 
ccgatcccca 
ttgtcggcat 
tcgtagtgat 
ccgacttcgt 
tggtggagct 
tcgtgtcgcg 
ggtacgagct 
ccgccgccgg 
aggcgctggc 
gagcaaaagc 
aacgttggcc 
ggaggatcac 
gctatctgaa 



cctcgggatc 
cgttcagtgc 
gctgccgccc 
aatacttgcg 
ctgggctatg 
cacgcggccg 
ccggagctgg 
ctagaccgcc 
gccggcgcgg 
cgcatggtgt 
cgcacccgga 
accctcaccc 
gtgaaagagg 
cgcagcgagg 
ttgaccgagg 
aaccgcacca 
tgatcgcggc 
aaatcctggc 
aagaaaccga 
atgcggtcgc 
gaaggttatc 
tctagcccgc 
gggcagtgcc 
cgaccgcccg 
cgacggagcg 
gctgattccg 
ggttaagcag 
ggcgatcaaa 
gcccattctt 
cacaaccgtt 
cgctgaaatt 
acaaacacgc 
agcctggcag 
accaagctga 
tacatcgcgc 



aaagtacttt 
agccgtcttc 
tgcccttttc 
actagaaccg 
cccgcgtcag 
gctgcaccaa 
ccaggatgct 
tggcccgcag 
gcctgcgtag 
tgaccgtgtt 
gcgggcgcga 
cggcacagat 
cggctgcact 
aagtgacgcc 
ccgacgccct 
ggacggccag 
cgggtacgtg 
cggtttgtct 
gcgccgccgt 
tgcgtatatg 
gctgtactta 
gccctgcaac 
cgcgattggg 
acgattgacc 
ccccaggcgg 
gtgcagccaa 
cgcattgagg 
ggcacgcgca 
gagtcccgta 
cttgaatcag 
aaatcaaaac 
taagtgccgg 
acacgccagc 
agatgtacgc 
agctaccaga 



gatccaaccc 
tgaaaacgac 
ctggcgtttt 
gagacattac 
caccgacgac 
gctgttttcc 
tgaccaccta 
cacccgcgac 
cctggcagag 
cgccggcatt 
ggccgccaag 
cgcgcacgcc 
gcttggcgtg 
caccgaggcc 
ggcggccgcc 
gacgaaccgt 
ttcgagccgc 
gatgccaagc 
ctaaaaaggt 
atgcgatgag 
accagaaagg 
tcgccggggc 
cggccgtgcg 
gcgacgtgaa 
cggacttggc 
gcccttacga 
tcacggatgg 
tcggcggtga 
tcacgcagcg 
aacccgaggg 
tcatttgagt 
ccgtccgagc 
catgaagcgg 
ggtacgccaa 
gtaaatgagc 



ctccgctgct 
atgtcgcaca 
cttgtcgcgt 
gccatgaaca 
caggacttga 
gagaagatca 
cgccctggcg 
ctactggaca 
ccgtgggccg 
gccgagttcg 
gcccgaggcg 
cgcgagctga 
catcgctcga 
aggcggcgcg 
gagaatgaac 
ttttcattac 
ccgcgcacgt 
tggcggcctg 
gatgtgtatt 
taaataaaca 
cgggtcaggc 
cgatgttctg 
ggaagatcaa 
ggccatcggc 
tgtgtccgcg 
catatgggcc 
aaggctacaa 
ggttgccgag 
cgtgagctac 
cgacgctgcc 
taatgaggta 
gcacgcagca 
gtcaactttc 
ggcaagacca 
aaatgaataa 
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atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160 
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220 
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280 
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340 
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400 
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460 
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520 
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580 
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640 
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700 
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760 
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820 
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880 
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940 
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000 
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060 
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120 
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180 
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240 
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300 
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360 
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420 
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480 
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540 
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600 
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660 
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720 
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780 
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840 
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900 
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960 
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020 
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080 
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140 
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200 
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260 
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320 
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380 
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440 
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500 
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560 
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620 
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680 
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740 
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800 
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860 
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920 
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980 
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040 
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100 
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160 
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220 
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280 
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340 
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400 
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460 
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520 
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580 
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640 
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700 
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760 
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820 
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880 
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940 
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000 
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060 
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120 




cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180 
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240 
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300 
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360 
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420 
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480 
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540 
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600 
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660 
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720 
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780 
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840 
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900 
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960 
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020 
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080 
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140 
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200 
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260 
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320 
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380 
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440 
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500 
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560 
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620 
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680 
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740 
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800 
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860 
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920 
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980 
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040 
tacgaattcg agccttgact agagggtcga cggtatacag acatgataag atacattgat 8100 
gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 8160 
gatgctattg ctttatttgt aaccattata agctgcaata aacaagttgg ggtgggcgaa 8220 
gaactccagc atgagatccc cgcgctggag gatcatccag ccggcgtccc ggaaaacgat 8280 
tccgaagccc aacctttcat agaaggcggc ggtggaatcg aaatctcgta gcacgtgtca 8340 
gtcctgctcc tcggccacga agtgcacgca gttgccggcc gggtcgcgca gggcgaactc 8400 
ccgcccccac ggctgctcgc cgatctcggt catggccggc ccggaggcgt cccggaagtt 8460 
cgtggacacg acctccgacc actcggcgta cagctcgtcc aggccgcgca cccacaccca 8520 
ggccagggtg ttgtccggca ccacctggtc ctggaccgcg ctgatgaaca gggtcacgtc 8580 
gtcccggacc acaccggcga agtcgtcctc cacgaagtcc cgggagaacc cgagccggtc 8640 
ggtccagaac tcgaccgctc cggcgacgtc gcgcgcggtg agcaccggaa cggcactggt 8700 
caacttggcc atggatccag atttcgctca agttagtata aaaaagcagg cttcaatcct 8760 
gcaggaattc gatcgacact ctcgtctact ccaagaatat caaagataca gtctcagaag 8820 
accaaagggc tattgagact tttcaacaaa gggtaatatc gggaaacctc ctcggattcc 8880 
attgcccagc tatctgtcac ttcatcaaaa ggacagtaga aaaggaaggt ggcacctaca 8940 
aatgccatca ttgcgataaa ggaaaggcta tcgttcaaga tgcctctgcc gacagtggtc 9000 
ccaaagatgg acccccaccc acgaggagca tcgtggaaaa agaagacgtt ccaaccacgt 9060 
cttcaaagca agtggattga tgtgataaca tggtggagca cgacactctc gtctactcca 9120 
agaatatcaa agatacagtc tcagaagacc aaagggctat tgagactttt caacaaaggg 9180 
taatatcggg aaacctcctc ggattccatt gcccagctat ctgtcacttc atcaaaagga 9240 
cagtagaaaa ggaaggtggc acctacaaat gccatcattg cgataaagga aaggctatcg 9300 
ttcaagatgc ctctgccgac agtggtccca aagatggacc cccacccacg aggagcatcg 9360 
tggaaaaaga agacgttcca accacgtctt caaagcaagt ggattgatgt gatatctcca 9420 
ctgacgtaag ggatgacgca caatcccact atccttcgca agaccttcct ctatataagg 9480 
aagttcattt catttggaga ggacacgctg aaatcaccag tctctctcta caaatctatc 9540 
tctctcgagc tttcgcagat ccgggggggc aatgagatat gaaaaagcct gaactcaccg 9600 
cgacgtctgt cgagaagttt ctgatcgaaa agttcgacag cgtctccgac ctgatgcagc 9660 
tctcggaggg cgaagaatct cgtgctttca gcttcgatgt aggagggcgt ggatatgtcc 9720 
tgcgggtaaa tagctgcgcc gatggtttct acaaagatcg ttatgtttat cggcactttg 9780 
catcggccgc gctcccgatt ccggaagtgc ttgacattgg ggagtttagc gagagcctga 9840 
cctattgcat ctcccgccgt gcacagggtg tcacgttgca agacctgcct gaaaccgaac 9900 
tgcccgctgt tctacaaccg gtcgcggagg ctatggatgc gatcgctgcg gccgatctta 9960 
gccagacgag cgggttcggc ccattcggac cgcaaggaat cggtcaatac actacatggc 10020 
gtgatttcat atgcgcgatt gctgatcccc atgtgtatca ctggcaaact gtgatggacg 10080 
acaccgtcag tgcgtccgtc gcgcaggctc tcgatgagct gatgctttgg gccgaggact 10140 
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gccccgaagt 
atggccgcat 
aggtcgccaa 
acttcgagcg 
gcattggtct 
gggcgcaggg 
aaatcgcccg 
gtggaaaccg 
atctgtcgat 
ggaattaggg 
gtatttgtat 
agtactaaaa 
gaatatcgcg 
atatggcgag 
atatggcaag 
atgacgaaat 
attcgattgt 
tggtcgtggc 
ttggtgtttg 
gcgtcatggt 
cctatttttt 
atcgtacttg 
tggaaatttt 
tattatgtat 
atatatcatt 
tttgtgcatt 
tgtcacttgg 
gggaaacgac 
atgacgcggg 
actcagccgc 
cgcgttcaaa 
ccactgacgt 
ccaaataatc 
tccccgggta 
aaaaactcga 
gttggtggga 
atcagttcgc 
tctttatacc 
attacggcaa 
catttgaagc 
gtgtgaacaa 
acggcaagaa 
gcgtaatgct 
tcgcgcaaga 
gcgttgaact 
ctttgcaagt 
acgtcacagc 
cagtggcagt 
ttggccgtca 
acgatcacgc 
acgctgaaga 
cagctgtcgg 
aactgtacag 
aagagctgat 
aaccggatac 
gtaaactcga 
ccgataccat 
tccaaagcgg 
aggagaaact 
tgcactcaat 
atcaccgcgt 
attttgcgac 
gcgaccgcaa 
tcggtgaaaa 
cgtcggctac 
ttggcaataa 
atttctgttg 



ccggcacctc 
aacagcggtc 
catcttcttc 
gaggcatccg 
tgaccaactc 
tcgatgcgac 
cagaagcgcg 
acgccccagc 
cgacaagctc 
ttcctatagg 
ttgtaaaata 
tccagatccc 
agtaaactga 
gaaaactgaa 
aaaactgaaa 
cactaaaaaa 
gctagccaat 
tggcggtggt 
cagcggtgtt 
tattggtggt 
acatattttt 
ttttataaaa 
ctccattgtt 
tttttcgttt 
tacaatgttt 
tggtgttgta 
gttttttttt 
aatctgatca 
acaagccgtt 
gggtttctgg 
agtcgcctaa 
tccataaatt 
tgcaccggat 
cggtcagtcc 
cggcctgtgg 
aagcgcgtta 
cgatgcagat 
gaaaggttgg 
agtgtgggtc 
cgatgtcacg 
cgaactgaac 
aaagcagtct 
ctacaccacg 
ctgtaaccac 
gcgtgatgcg 
ggtgaatccg 
caaaagccag 
gaagggcgaa 
tgaagatgcg 
attaatggac 
gatgctcgac 
ctttaacctc 
cgaagaggca 
agcgcgtgac 
ccgtccgcaa 
tccgacgcgt 
cagcgatctc 
cgatttggaa 
gcatcagccg 
gtacaccgac 
ctttgatcgc 
ctcgcaaggc 
accgaagtcg 
accgcagcag 
agcctcggga 
agtttcttaa 
aattacgtta 



gtgcacgcgg 
attgactgga 
tggaggccgt 
gagcttgcag 
tatcagagct 
gcaatcgtcc 
gccgtctgga 
actcgtccga 
gagtttctcc 
gtttcgctca 
cttctatcaa 
ccgaattaat 
aaatcacgga 
aaaggtggaa 
atcatggaaa 
cgtgaaaaat 
gtttaacaag 
ggaaaattgc 
tgatatcgga 
tggtcatcta 
tattaaattt 
tattttatta 
ttttctatat 
tataataaat 
aaaagtcatt 
catgtctatt 
ttaagacata 
tgagcggaga 
ttacgtttgg 
agtttaatga 
ggtcactatc 
cccctcggta 
ctcgagatcg 
cttatgttac 
gcattcagtc 
caagaaagcc 
attcgtaatt 
gcaggccagc 
aataatcagg 
ccgtatgtta 
tggcagacta 
tacttccatg 
ccgaacacct 
gcgtctgttg 
gatcaacagg 
cacctctggc 
acagagtgtg 
cagttcctga 
gatttgcgcg 
tggattgggg 
tgggcagatg 
tctttaggca 
gtcaacgggg 
aaaaaccacc 
ggtgcacggg 
ccgatcacct 
tttgatgtgc 
acggcagaga 
attatcatca 
atgtggagtg 
gtcagcgccg 
atattgcgcg 
gcggcttttc 
ggaggcaaac 
attgcgtacc 
gattgaatcc 
agcatgtaat 



atttcggctc 
gcgaggcgat 
ggttggcttg 
gatcgccacg 
tggttgacgg 
gatccggagc 
ccgatggctg 
gggcaaagaa 
ataataatgt 
tgtgttgagc 
taaaatttct 
tcggcgttaa 
aaatgagaaa 
aatttagaaa 
atgagaaaca 
gagaaatgca 
atgtcaagca 

ggtggttcga 

atcacttatg 
tatattttta 
atgcattgtt 
ttttatgtgt 
ttataataat 
atttattaaa 
tgtgaatata 
atgattctct 
atcactagtg 
attaagggag 
aactgacaga 
gctaagcaca 
agctagcaaa 
tccaattaga 
aattcccgcg 
gtcctgtaga 
tggatcgcga 
gggcaattgc 
atgtgggcaa 
gtatcgtgct 
aagtgatgga 
ttgccgggaa 
tcccgccggg 
atttctttaa 
gggtggacga 
actggcaggt 
tggttgcaac 
aaccgggtga 
atatctaccc 
tcaaccacaa 
gcaaaggatt 
ccaactccta 
aacatggcat 
ttggtttcga 
aaactcagca 
caagcgtggt 
aatatttcgc 
gcgtcaatgt 
tgtgcctgaa 
aggtactgga 
ccgaatacgg 
aagagtatca 
tcgtcggtga 
ttggcggtaa 
tgctgcaaaa 
aatgaatcaa 
gagctcgaat 
tgttgccggt 
aattaacatg 



caacaatgtc 
gttcggggat 
tatggagcag 
actccgggcg 
caatttcgat 
cgggactgtc 
tgtagaagta 
atagagtaga 
gtgagtagtt 
atataagaaa 
aattcctaaa 
ttcagatcaa 
tacacacttt 
tgtccactgt 
tccacttgac 
cactgaagga 
caatgaatgt 
gcggtagtga 
gtggttgtca 
taataatatt 
tgtattttta 
tatattatta 
tttcttattt 
aaaaatatta 
ttagctaagt 
ggccaaaaca 
attatatcta 
tcacgttatg 
accgcaacgt 
tacgtcagaa 
tatttcttgt 
gtctcatatt 
gccgcgaatt 
aaccccaacc 
aaactgtgga 
tgtgccaggc 
cgtctggtat 
gcgtttcgat 
gcatcagggc 
aagtgtacgt 
aatggtgatt 
ctacgccggg 
tatcaccgtg 
ggtggccaat 
tggacaaggc 
aggttatctc 
gctgcgcgtc 
accgttctac 
cgataacgtg 
ccgtacctcg 
cgtggtgatt 
agcgggcaac 
ggcgcactta 
gatgtggagt 
gccactggcg 
aatgttctgc 
ccgttattac 
aaaagaactt 
cgtggatacg 
gtgtgcatgg 
acaggtatgg 
caagaagggg 
acgctggact 
caactctcct 
ttccccgatc 
cttgcgatga 
taatgcatga 



ctgacggaca 
tcccaatacg 
cagacgcgct 
tatatgctcc 
gatgcagctt 
gggcgtacac 
ctcgccgata 
tgccgaccgg 
cccagataag 
cccttagtat 
accaaaatcc 
gcttgacctg 
aggacgtgaa 
aggacgtgga 
gacttgaaaa 
ctccgcggga 
tggtggttgg 
tcggcgatgg 
caatggaggt 
aagtatttta 
aatagttttt 
cttgatgtat 
ttttttgttt 
tttttgtaaa 
tgtacttctt 
tgtctactcc 
gactgaaggc 
acccccgccg 
tgaaggagcc 
accattattg 
caaaaatgct 
cactctcaat 
cactagtgga 
cgtgaaatca 
attgagcagc 
agttttaacg 
cagcgcgaag 
gcggtcactc 
ggctatacgc 
atcacagttt 
accgacgaaa 
atccatcgca 
gtgacgcatg 
ggtgatgtca 
accagcggga 
tatgaactgt 
ggcatccggt 
tttactggct 
ctgatggtgc 
cattaccctt 
gatgaaactg 
aagccgaaag 
caggcgatta 
attgccaacg 
gaagcaacgc 
gacgctcaca 
ggttggtatg 
ctggcctggc 
ttagccgggc 
ctggatatgt 
aatttcgccg 
atcttcaccc 
ggcatgaact 
ggcgcaccat 
gttcaaacat 
ttatcatata 
cgttatttat 



10200 
10260 
10320 
10380 
10440 
10500 
10560 
10620 
10680 
10740 
10800 
10860 
10920 
10980 
11040 
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520 
11580 
11640 
11700 
11760 
11820 
11880 
11940 
12000 
12060 
12120 
12180 
12240 
12300 
12360 
12420 
12480 
12540 
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
13140 
13200 
13260 
13320 
13380 
13440 
13500 
13560 
13620 
13680 
13740 
13800 
13860 
13920 
13980 
14040 
14100 
14160 
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gagatgggtt tttatgatta gagtcccgca attatacatt taatacgcga tagaaaacaa 1422 0 

aatatagcgc gcaaactagg ataaattatc gcgcgcggtg tcatctatgt tactagatcg 14280 

ggaattcgat atcaagcttg gcactggccg tcgttttaca acgtcgtgac tgggaaaacc 14340 

ctggcgttac ccaacttaat cgccttgcag cacatccccc tttcgccagc tggcgtaata 14400 

gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatgct 14460 

agagcagctt gagcttggat cagattgtcg tttcccgcct tcagtttaaa ctatcagtgt 14520 

ttgacaggat atattggcgg gtaaacctaa gagaaaagag cgtttattag aataacggat 14580 

atttaaaagg gcgtgaaaag gtttatccgt tcgtccattt gtatgtg 14627 

<210> 22 
<211> 4257 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pPUR Plasmid 
<400> 22 

ctgtggaatg tgtgtcagtt agggtgtgga aagtccccag gctccccagc aggcagaagt 60 
atgcaaagca tgcatctcaa ttagtcagca accaggtgtg gaaagtcccc aggctcccca 120 
gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccatagt cccgccccta 180 
actccgccca tcccgcccct aactccgccc agttccgccc attctccgcc ccatggctga 240 
ctaatttttt ttatttatgc agaggccgag gccgcctcgg cctctgagct attccagaag 300 
tagtgaggag gcttttttgg aggcctaggc ttttgcaaaa agcttgcatg cctgcaggtc 360 
ggccgccacg accggtgccg ccaccatccc ctgacccacg cccctgaccc ctcacaagga 420 
gacgaccttc catgaccgag tacaagccca cggtgcgcct cgccacccgc gacgacgtcc 480 
cccgggccgt acgcaccctc gccgccgcgt tcgccgacta ccccgccacg cgccacaccg 540 
tcgacccgga ccgccacatc gagcgggtca ccgagctgca agaactcttc ctcacgcgcg 600 
tcgggctcga catcggcaag gtgtgggtcg cggacgacgg cgccgcggtg gcggtctgga 660 
ccacgccgga gagcgtcgaa gcgggggcgg tgttcgccga gatcggcccg cgcatggccg 720 
agttgagcgg ttcccggctg gccgcgcagc aacagatgga aggcctcctg gcgccgcacc 780 
ggcccaagga gcccgcgtgg ttcctggcca ccgtcggcgt ctcgcccgac caccagggca 840 
agggtctggg cagcgccgtc gtgctccccg gagtggaggc ggccgagcgc gccggggtgc 900 
ccgccttcct ggagacctcc gcgccccgca acctcccctt ctacgagcgg ctcggcttca 960 
ccgtcaccgc cgacgtcgag gtgcccgaag gaccgcgcac ctggtgcatg acccgcaagc 1020 
ccggtgcctg acgcccgccc cacgacccgc agcgcccgac cgaaaggagc gcacgacccc 1080 
atggctccga ccgaagccga cccgggcggc cccgccgacc ccgcacccgc ccccgaggcc 1140 
caccgactct agaggatcat aatcagccat accacatttg tagaggtttt acttgcttta 1200 
aaaaacctcc cacacctccc cctgaacctg aaacataaaa tgaatgcaat tgttgttgtt 1260 
aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 1320 
aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 1380 
tatcatgtct ggatccccag gaagctcctc tgtgtcctca taaaccctaa cctcctctac 1440 
ttgagaggac attccaatca taggctgccc atccaccctc tgtgtcctcc tgttaattag 1500 
gtcacttaac aaaaaggaaa ttgggtaggg gtttttcaca gaccgctttc taagggtaat 1560 
tttaaaatat ctgggaagtc ccttccactg ctgtgttcca gaagtgttgg taaacagccc 1620 
acaaatgtca acagcagaaa catacaagct gtcagctttg cacaagggcc caacaccctg 1680 
ctcatcaaga agcactgtgg ttgctgtgtt agtaatgtgc aaaacaggag gcacattttc 1740 
cccacctgtg taggttccaa aatatctagt gttttcattt ttacttggat caggaaccca 1800 
gcactccact ggataagcat tatccttatc caaaacagcc ttgtggtcag tgttcatctg 1860 
ctgactgtca actgtagcat tttttggggt tacagtttga gcaggatatt tggtcctgta 1920 
gtttgctaac acaccctgca gctccaaagg ttccccacca acagcaaaaa aatgaaaatt 1980 
tgacccttga atgggttttc cagcaccatt ttcatgagtt ttttgtgtcc ctgaatgcaa 2040 
gtttaacata gcagttaccc caataacctc agttttaaca gtaacagctt cccacatcaa 2100 
aatatttcca caggttaagt cctcatttaa attaggcaaa ggaattcttg aagacgaaag 2160 
ggcctcgtga tacgcctatt tttataggtt aatgtcatga taataatggt ttcttagacg 2220 
tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata 2280 
cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga 2340 
aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca 2400 
ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat 2460 
cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag 2520 
agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc 2580 
gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat acactattct 2640 
cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca 2700 
gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt 2760 
ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat 2820 
gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt 2880 
gacaccacga tgcctgcagc aatggcaaca acgttgcgca aactattaac tggcgaacta 2940 
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cttactctag 
ccacttctgc 
gagcgtgggt 
gtagttatct 
gagataggtg 
ctttagattg 
gataatctca 
gtagaaaaga 
caaacaaaaa 
ctttttccga 
tagccgtagt 
ctaatcctgt 
tcaagacgat 
cagcccagct 
gaaagcgcca 
ggaacaggag 
gtcgggtttc 
agcctatgga 
tttgctcaca 
tttgagtgag 
gaggaagcgg 
caccgcatat 



cttcccggca 
gctcggccct 
ctcgcggtat 
acacgacggg 
cctcactgat 
atttaaaact 
tgaccaaaat 
tcaaaggatc 
aaccaccgct 
aggtaactgg 
taggccacca 
taccagtggc 
agttaccgga 
tggagcgaac 
cgcttcccga 
agcgcacgag 
gccacctctg 
aaaacgccag 
tgttctttcc 
ctgataccgc 
aagagcgcct 
ggtgcactct 



acaattaata 
tccggctggc 
cattgcagca 
gagtcaggca 
taagcattgg 
tcatttttaa 
cccttaacgt 
ttcttgagat 
accagcggtg 
cttcagcaga 
cttcaagaac 
tgctgccagt 
taaggcgcag 
gacctacacc 
agggagaaag 
ggagcttcca 
acttgagcgt 
caacgcggcc 
tgcgttatcc 
tcgccgcagc 
gatgcggtat 
cagtacaatc 




gactggatgg 
tggtttattg 
ctggggccag 
actatggatg 
taactgtcag 
tttaaaagga 
gagttttcgt 
cctttttttc 
gtttgtttgc 
gcgcagatac 
tctgtagcac 
ggcgataagt 
cggtcgggct 
gaactgagat 
gcggacaggt 
gggggaaacg 
cgatttttgt 
tttttacggt 
cctgattctg 
cgaacgaccg 
tttctcctta 
tgctctgatg 



aggcggataa 
ctgataaatc 
atggtaagcc 
aacgaaatag 
accaagttta 
tctaggtgaa 
tccactgagc 
tgcgcgtaat 
cggatcaaga 
caaatactgt 
cgcctacata 
cgtgtcttac 
gaacgggggg 
acctacagcg 
atccggtaag 
cctggtatct 
gatgctcgtc 
tcctggcctt 
tggataaccg 
agcgcagcga 
cgcatctgtg 
ccgcatagtt 



agttgcagga 
tggagccggt 
ctcccgtatc 
acagatcgct 
ctcatatata 
gatccttttt 
gtcagacccc 
ctgctgcttg 
gctaccaact 
ccttctagtg 
cctcgctctg 
cgggttggac 
ttcgtgcaca 
tgagctatga 
cggcagggtc 
ttatagtcct 
aggggggcgg 
ttgctggcct 
tattaccgcc 
gtcagtgagc 
cggtatttca 
aagccag 



3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4257 



<210> 23 
<211> 2713 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> pNEB193 Plasmid 



<400> 23 

tcgcgcgttt 

cagcttgtct 

ttggcgggtg 

accatatgcg 

attcgccatt 

tacgccagct 

tttcccagtc 

gcgccggatc 

gcgtaatcat 

aacatacgag 

acattaattg 

cattaatgaa 

tcctcgctca 

tcaaaggcgg 

gcaaaaggcc 

aggctccgcc 

ccgacaggac 

gttccgaccc 

ctttctcata 

ggctgtgtgc 

cttgagtcca 

attagcagag 

ggctacacta 

aaaagagttg 

gtttgcaagc 

tctacggggt 

ttatcaaaaa 

taaagtatat 

atctcagcga 

actacgatac 

cgctcaccgg 

agtggtcctg 

gtaagtagtt 

gtgtcacgct 

gttacatgat 



cggtgatgac 
gtaagcggat 
tcggggctgg 
gtgtgaaata 
caggctgcgc 
ggcgaaaggg 
acgacgttgt 
cttaattaag 
ggtcatagct 
ccggaagcat 
cgttgcgctc 
tcggccaacg 
ctgactcgct 
taatacggtt 
agcaaaaggc 
cccctgacga 
tataaagata 
tgccgcttac 
gctcacgctg 
acgaaccccc 
acccggtaag 
cgaggtatgt 
gaaggacagt 
gtagctcttg 
agcagattac 
ctgacgctca 
ggatcttcac 
atgagtaaac 
tctgtctatt 
gggagggctt 
ctccagattt 
caactttatc 
cgccagttaa 
cgtcgtttgg 
cccccatgtt 



ggtgaaaacc 
gccgggagca 
cttaactatg 
ccgcacagat 
aactgttggg 
ggatgtgctg 
aaaacgacgg 
tctagagtcg 
gtttcctgtg 
aaagtgtaaa 
actgcccgct 
cgcggggaga 
gcgctcggtc 
atccacagaa 
caggaaccgt 
gcatcacaaa 
ccaggcgttt 
cggatacctg 
taggtatctc 
cgttcagccc 
acacgactta 
aggcggtgct 
atttggtatc 
atccggcaaa 
gcgcagaaaa 
gtggaacgaa 
ctagatcctt 
ttggtctgac 
tcgttcatcc 
accatctggc 
atcagcaata 
cgcctccatc 
tagtttgcgc 
tatggcttca 
gtgcaaaaaa 



tctgacacat 
gacaagcccg 
cggcatcaga 
gcgtaaggag 
aagggcgatc 
caaggcgatt 
ccagtgaatt 
actgtttaaa 
tgaaattgtt 
gcctggggtg 
ttccagtcgg 
ggcggtttgc 
gttcggctgc 
tcaggggata 
aaaaaggccg 
aatcgacgct 
ccccctggaa 
tccgcctttc 
agttcggtgt 
gaccgctgcg 
tcgccactgg 
acagagttct 
tgcgctctgc 
caaaccaccg 
aaaggatctc 
aactcacgtt 
ttaaattaaa 
agttaccaat 
atagttgcct 
cccagtgctg 
aaccagccag 
cagtctatta 
aacgttgttg 
ttcagctccg 
gcggttagct 



gcagctcccg 
tcagggcgcg 
gcagattgta 
aaaataccgc 
ggtgcgggcc 
aagttgggta 
cgagctcggt 
cctgcaggca 
atccgctcac 
cctaatgagt 
gaaacctgtc 
gtattgggcg 
ggcgagcggt 
acgcaggaaa 
cgttgctggc 
caagtcagag 
gctccctcgt 
tcccttcggg 
aggtcgttcg 
ccttatccgg 
cagcagccac 
tgaagtggtg 
tgaagccagt 
ctggtagcgg 
aagaagatcc 
aagggatttt 
aatgaagttt 
gcttaatcag 
gactccccgt 
caatgatacc 
ccggaagggc 
attgttgccg 
ccattgctac 
gttcccaacg 
ccttcggtcc 



gagacggtca 
tcagcgggtg 
ctgagagtgc 
atcaggcgcc 
tcttcgctat 
acgccagggt 
acccgggggc 
tgcaagcttg 
aattccacac 
gagctaactc 
gtgccagctg 
ctcttccgct 
atcagctcac 
gaacatgtga 
gtttttccat 
gtggcgaaac 
gcgctctcct 
aagcgtggcg 
ctccaagctg 
taactatcgt 
tggtaacagg 
gcctaactac 
taccttcgga 
tggttttttt 
tttgatcttt 
ggtcatgaga 
taaatcaatc 
tgaggcacct 
cgtgtagata 
gcgagaccca 
cgagcgcaga 
ggaagctaga 
aggcatcgtg 
atcaaggcga 
tccgatcgtt 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 
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gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct 2160 
cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca 2220 
ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat acgggataat 2280 
accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga 2340 
aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc 2400 
aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg 2460 
caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc 2520 
ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt 2580 
gaatgtattt agaaaaataa acaaataggg gttccgcgca catttccccg aaaagtgcca 2640 
cctgacgtct aagaaaccat tattatcatg acattaacct ataaaaatag gcgtatcacg 2700 
aggccctttc gtc 2713 

<210> 24 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attPUP Primer 
<400> 24 

ccttgcgcta atgctctgtt acagg 25 

<210> 25 
<211> 26 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> attPDWN Primer 
<400> 25 

cagaggcagg gagtgggaca aaattg 26 

<210> 26 
<211> 4346 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pSV40193attPsensePUR Plasmid 

<400> 26 

ccggtgccgc caccatcccc tgacccacgc ccctgacccc tcacaaggag acgaccttcc 60 
atgaccgagt acaagcccac ggtgcgcctc gccacccgcg acgacgtccc ccgggccgta 120 
cgcaccctcg ccgccgcgtt cgccgactac cccgccacgc gccacaccgt cgacccggac 180 
cgccacatcg agcgggtcac cgagctgcaa gaactcttcc tcacgcgcgt cgggctcgac 240 
atcggcaagg tgtgggtcgc ggacgacggc gccgcggtgg cggtctggac cacgccggag 300 
agcgtcgaag cgggggcggt gttcgccgag atcggcccgc gcatggccga gttgagcggt 360 
tcccggctgg ccgcgcagca acagatggaa ggcctcctgg cgccgcaccg gcccaaggag 420 
cccgcgtggt tcctggccac cgtcggcgtc tcgcccgacc accagggcaa gggtctgggc 480 
agcgccgtcg tgctccccgg agtggaggcg gccgagcgcg ccggggtgcc cgccttcctg 540 
gagacctccg cgccccgcaa cctccccttc tacgagcggc tcggcttcac cgtcaccgcc 600 
gacgtcgagg tgcccgaagg accgcgcacc tggtgcatga cccgcaagcc cggtgcctga 660 
cgcccgcccc acgacccgca gcgcccgacc gaaaggagcg cacgacccca tggctccgac 720 
cgaagccgac ccgggcggcc ccgccgaccc cgcacccgcc cccgaggccc accgactcta 780 
gaggatcata atcagccata ccacatttgt agaggtttta cttgctttaa aaaacctccc 840 
acacctcccc ctgaacctga aacataaaat gaatgcaatt gttgttgtta acttgtttat 900 
tgcagcttat aatggttaca aataaagcaa tagcatcaca aatttcacaa ataaagcatt 960 
tttttcactg cattctagtt gtggtttgtc caaactcatc aatgtatctt atcatgtctg 1020 
gatccgcgcc ggatccttaa ttaagtctag agtcgactgt ttaaacctgc aggcatgcaa 1080 
gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc 1140 
cacacaacat acgagccgga agcataaagt gtaaagcctg gggtgcctaa tgagtgagct 1200 
aactcacatt aattgcgttg cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc 1260 
agctgcatta atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt 1320 
ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 1380 
ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 1440 
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tgtgagcaaa 
tccataggct 
gaaacccgac 
ctcctgttcc 
tggcgctttc 
agctgggctg 
atcgtcttga 
acaggattag 
actacggcta 
tcggaaaaag 
tttttgtttg 
tcttttctac 
tgagattatc 
caatctaaag 
cacctatctc 
agataactac 
acccacgctc 
gcagaagtgg 
ctagagtaag 
tcgtggtgtc 
ggcgagttac 
tcgttgtcag 
attctcttac 
agtcattctg 
ataataccgc 
ggcgaaaact 
cacccaactg 
gaaggcaaaa 
tcttcctttt 
tatttgaatg 
tgccacctga 
tcacgaggcc 
agctcccgga 
agggcgcgtc 
agattgtact 
aataccgcat 
tgcgggcctc 
gttgggtaac 
agctgtggaa 
gtatgcaaag 
cagcaggcag 
taactccgcc 
gactaatttt 
agtagtgagg 
tcactaatac 
tatgtagtct 
gtttctcgtt 
tgttgcaacg 
cccactccct 



aggccagcaa 
ccgcccccct 
aggactataa 
gaccctgccg 
tcatagctca 
tgtgcacgaa 
gtccaacccg 
cagagcgagg 
cactagaagg 
agttggtagc 
caagcagcag 
ggggtctgac 
aaaaaggatc 
tatatatgag 
agcgatctgt 
gatacgggag 
accggctcca 
tcctgcaact 
tagttcgcca 
acgctcgtcg 
atgatccccc 
aagtaagttg 
tgtcatgcca 
agaatagtgt 
gccacatagc 
ctcaaggatc 
atcttcagca 
tgccgcaaaa 
tcaatattat 
tatttagaaa 
cgtctaagaa 
ctttcgtctc 
gacggtcaca 
agcgggtgtt 
gagagtgcac 
caggcgccat 
ttcgctatta 
gccagggttt 
tgtgtgtcag 
catgcatctc 
aagtatgcaa 
catcccgccc 
ttttatttat 
aggctttttt 
catctaagta 
gttttttatg 
cagctttttt 
aacaggtcac 
gcctctgggg 



aaggccagga 
gacgagcatc 
agataccagg 
cttaccggat 
cgctgtaggt 
ccccccgttc 
gtaagacacg 
tatgtaggcg 
acagtatttg 
tcttgatccg 
attacgcgca 
gctcagtgga 
ttcacctaga 
taaacttggt 
ctatttcgtt 
ggcttaccat 
gatttatcag 
ttatccgcct 
gttaatagtt 
tttggtatgg 
atgttgtgca 
gccgcagtgt 
tccgtaagat 
atgcggcgac 
agaactttaa 
ttaccgctgt 
tcttttactt 
aagggaataa 
tgaagcattt 
aataaacaaa 
accattatta 
gcgcgtttcg 
gcttgtctgt 
ggcgggtgtc 
catatgcggt 
tcgccattca 
cgccagctgg 
tcccagtcac 
ttagggtgtg 
aattagtcag 
agcatgcatc 
ctaactccgc 
gcagaggccg 
ggaggctcgg 
gttgattcat 
caaaatctaa 
atactaagtt 
tatcagtcaa 
ggcgcg 



accgtaaaaa 
acaaaaatcg 
cgtttccccc 
acctgtccgc 
atctcagttc 
agcccgaccg 
acttatcgcc 
gtgctacaga 
gtatctgcgc 
gcaaacaaac 
gaaaaaaagg 
acgaaaactc 
tccttttaaa 
ctgacagtta 
catccatagt 
ctggccccag 
caataaacca 
ccatccagtc 
tgcgcaacgt 
cttcattcag 
aaaaagcggt 
tatcactcat 
gcttttctgt 
cgagttgctc 
aagtgctcat 
tgagatccag 
tcaccagcgt 
gggcgacacg 
atcagggtta 
taggggttcc 
tcatgacatt 
gtgatgacgg 
aagcggatgc 
ggggctggct 
gtgaaatacc 
ggctgcgcaa 
cgaaaggggg 
gacgttgtaa 
gaaagtcccc 
caaccaggtg 
tcaattagtc 
ccagttccgc 
aggccgcctc 
tacccccttg 
agtgactgca 
tttaatatat 
ggcattataa 
aataaaatca 



ggccgcgttg 
acgctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
actggcagca 
gttcttgaag 
tctgctgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
ttaaaaatga 
ccaatgctta 
tgcctgactc 
tgctgcaatg 
gccagccgga 
tattaattgt 
tgttgccatt 
ctccggttcc 
tagctccttc 
ggttatggca 
gactggtgag 
ttgcccggcg 
cattggaaaa 
ttcgatgtaa 
ttctgggtga 
gaaatgttga 
ttgtctcatg 
gcgcacattt 
aacctataaa 
tgaaaacctc 
cgggagcaga 
taactatgcg 
gcacagatgc 
ctgttgggaa 
atgtgctgca 
aacgacggcc 
aggctcccca 
tggaaagtcc 
agcaaccata 
ccattctccg 
ggcctctgag 
cgctaatgct 
tatgttgtgt 
tgatatttat 
aaaagcattg 
ttatttgatt 



ctggcgtttt 
cagaggtggc 
ctcgtgcgct 
tcgggaagcg 
gttcgctcca 
tccggtaact 
gccactggta 
tggtggccta 
ccagttacct 
agcggtggtt 
gatcctttga 
attttggtca 
agttttaaat 
atcagtgagg 
cccgtcgtgt 
ataccgcgag 
agggccgagc 
tgccgggaag 
gctacaggca 
caacgatcaa 
ggtcctccga 
gcactgcata 
tactcaacca 
tcaatacggg 
cgttcttcgg 
cccactcgtg 
gcaaaaacag 
atactcatac 
agcggataca 
ccccgaaaag 
aataggcgta 
tgacacatgc 
caagcccgtc 
gcatcagagc 
gtaaggagaa 
gggcgatcgg 
aggcgattaa 
agtgaattcg 
gcaggcagaa 
ccaggctccc 
gtcccgcccc 
ccccatggct 
ctattccaga 
ctgttacagg 
tttacagtat 
atcattttac 
cttatcaatt 
tcaattttgt 



1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4346 



<210> 27 
<211> 5855 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCXLamIntR Plasmid 



<400> 27 

gtcgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 60 
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120 
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180 
ggactttcca ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac 240 
atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300 
cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360 
tattagtcat cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc 420 
atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480 
gcgatggggg cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540 
gggcggggcg aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600 
tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660 
gggagtcgct gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720 
ccggctctga ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780 
gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840 
ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900 
tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960 
cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020 
ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080 
tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140 
cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200 
gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260 
ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320 
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380 
gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440 
tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500 
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560 
acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620 
gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680 
acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcatggga agaaggcgaa 1740 
gtcatgagcg ccgggattta ccccctaacc tttatataag aaacaatgga tattactgct 1800 
acagggaccc aaggacgggt aaagagtttg gattaggcag agacaggcga atcgcaatca 1860 
ctgaagctat acaggccaac attgagttat tttcaggaca caaacacaag cctctgacag 1920 
cgagaatcaa cagtgataat tccgttacgt tacattcatg gcttgatcgc tacgaaaaaa 1980 
tcctggccag cagaggaatc aagcagaaga cactcataaa ttacatgagc aaaattaaag 2040 
caataaggag gggtctgcct gatgctccac ttgaagacat caccacaaaa gaaattgcgg 2100 
caatgctcaa tggatacata gacgagggca aggcggcgtc agccaagtta atcagatcaa 2160 
cactgagcga tgcattccga gaggcaatag ctgaaggcca tataacaaca aaccatgtcg 2220 
ctgccactcg cgcagcaaaa tctagagtaa ggagatcaag acttacggct gacgaatacc 2280 
tgaaaattta tcaagcagca gaatcatcac catgttggct cagacttgca atggaactgg 2340 
ctgttgttac cgggcaacga gttggtgatt tatgcgaaat gaagtggtct gatatcgtag 2400 
atggatatct ttatgtcgag caaagcaaaa caggcgtaaa aattgccatc ccaacagcat 2460 
tgcatattga tgctctcgga atatcaatga aggaaacact tgataaatgc aaagagattc 2520 
ttggcggaga aaccataatt gcatctactc gtcgcgaacc gctttcatcc ggcacagtat 2580 
caaggtattt tatgcgcgca cgaaaagcat caggtctttc cttcgaaggg gatccgccta 2640 
cctttcacga gttgcgcagt ttgtctgcaa gactctatga gaagcagata agcgataagt 2700 
ttgctcaaca tcttctcggg cataagtcgg acaccatggc atcacagtat cgtgatgaca 2760 
gaggcaggga gtgggacaaa attgaaatca aataagaatt cactcctcag gtgcaggctg 2820 
cctatcagaa ggtggtggct ggtgtggcca atgccctggc tcacaaatac cactgagatc 2880 
tttttccctc tgccaaaaat tatggggaca tcatgaagcc ccttgagcat ctgacttctg 2940 
gctaataaag gaaatttatt ttcattgcaa tagtgtgttg gaattttttg tgtctctcac 3000 
tcggaaggac atatgggagg gcaaatcatt taaaacatca gaatgagtat ttggtttaga 3060 
gtttggcaac atatgccata tgctggctgc catgaacaaa ggtggctata aagaggtcat 3120 
cagtatatga aacagccccc tgctgtccat tccttattcc atagaaaagc cttgacttga 3180 
ggttagattt tttttatatt ttgttttgtg ttattttttt ctttaacatc cctaaaattt 3240 
tccttacatg ttttactagc cagatttttc ctcctctcct gactactccc agtcatagct 3300 
gtccctcttc tcttatgaag atccctcgac ctgcagccca agcttggcgt aatcatggtc 3360 
atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 3420 
aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 3480 
gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagcggatcc gcatctcaat 3540 
tagtcagcaa ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt 3600 
tccgcccatt ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc 3660 
gcctcggcct ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt 3720 
tgcaaaaagc taacttgttt attgcagctt ataatggtta caaataaagc aatagcatca 3780 
caaatttcac aaataaagca tttttttcac tgcattctag ttgtggtttg tccaaactca 3840 
tcaatgtatc ttatcatgtc tggatccgct gcattaatga atcggccaac gcgcggggag 3900 
aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 3960 
cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 4020 
atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 4080 
taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 4140 
aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 4200 
tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 4260 
gtccgccttt ctcccttcgg gaagcgtggc gctttctcaa tgctcacgct gtaggtatct 4320 
cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 4380 
cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 4440 
atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 4500 
tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat 4560 
ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 4620 
acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 4680 
aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 4740 
aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct 4800 
tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga 4860 
cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc 4920 
catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg 4980 
ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat 5040 
aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat 5100 
ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg 5160 
caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc 5220 
attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa 5280 
agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc 5340 
actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt 5400 
ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag 5460 
ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt 5520 
gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag 5580 
atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac 5640 
cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc 5700 
gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca 5760 
gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg 5820 
ggttccgcgc acatttcccc gaaaagtgcc acctg 5855 

<210> 28 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> 5PacSV40 Primer 
<400> 28 

ctgttaatta actgtggaat gtgtgtcagt tagggtg 37 

<210> 29 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Antisense Zeo Primer 
<400> 29 

tgaacagggt cacgtcgtcc 20 

<210> 30 
<211> 1032 
<212> DNA 

<213> Escherichia Coli 

<220> 

<221> CDS 

<222> (1)...(1032) 

<223> nucleotide sequence encoding Cre recombinase 
<400> 30 

atg tcc aat tta ctg acc gta cac caa aat ttg cct gca tta ccg gtc 48 
Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 
1 5 10 15 

gat gca acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc agg 96 
Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 
20 25 30 

gat cgc cag gcg ttt tct gag cat acc tgg aaa atg ctt ctg tcc gtt 144 
Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 
35 40 45 

tgc cgg tcg tgg gcg gca tgg tgc aag ttg aat aac cgg aaa tgg ttt 192 
Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 
50 55 60 

ccc gca gaa cct gaa gat gtt cgc gat tat ctt cta tat ctt cag gcg 240 
Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 
65 70 75 80 

cgc ggt ctg gca gta aaa act atc cag caa cat ttg ggc cag cta aac 288 
Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 
85 90 95 

atg ctt cat cgt cgg tcc ggg ctg cca cga cca agt gac agc aat gct 336 
Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 
100 105 110 

gtt tca ctg gtt atg cgg cgg atc cga aaa gaa aac gtt gat gcc ggt 384 
Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 
115 120 125 

gaa cgt gca aaa cag gct cta gcg ttc gaa cgc act gat ttc gac cag 432 
Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 
130 135 140 

gtt cgt tca ctc atg gaa aat agc gat cgc tgc cag gat ata cgt aat 480 
Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 
145 150 155 160 

ctg gca ttt ctg ggg att gct tat aac acc ctg tta cgt ata gcc gaa 528 
Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 
165 170 175 

att gcc agg atc agg gtt aaa gat atc tca cgt act gac ggt ggg aga 576 
Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 
180 185 190 

atg tta atc cat att ggc aga acg aaa acg ctg gtt agc acc gca ggt 624 
Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 
195 200 205 

gta gag aag gca ctt agc ctg ggg gta act aaa ctg gtc gag cga tgg 672 
Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 
210 215 220 

att tcc gtc tct ggt gta gct gat gat ccg aat aac tac ctg ttt tgc 720 
Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 
225 230 235 240 

cgg gtc aga aaa aat ggt gtt gcc gcg cca tct gcc acc agc cag cta 768 
Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 
245 250 255 

tca act cgc gcc ctg gaa ggg att ttt gaa gca act cat cga ttg att 816 
Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 
260 265 270 

tac ggc gct aag gat gac tct ggt cag aga tac ctg gcc tgg tct gga 864 
Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 
275 280 285 

cac agt gcc cgt gtc gga gcc gcg cga gat atg gcc cgc gct gga gtt 912 
His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 
290 295 300 

tca ata ccg gag atc atg caa gct ggt ggc tgg acc aat gta aat att 960 
Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 
305 310 315 320 

gtc atg aac tat atc cgt aac ctg gat agt gaa aca ggg gca atg gtg 1008 
Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 
325 330 335 

cgc ctg ctg gaa gat ggc gat tag 1032 
Arg Leu Leu Glu Asp Gly Asp * 
340 

<210> 31 
<211> 343 
<212> PRT 

<213> Escherichia Coli 
<400> 31 

Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 
1 5 10 15 

Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 
20 25 30 

Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 
35 40 45 

Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 
50 55 60 

Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 
65 70 75 80 

Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 
85 90 95 

Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 
100 105 110 

Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 
115 120 125 

Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 
130 135 140 

Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 
145 150 155 160 

Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 
165 170 175 

Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 
180 185 190 

Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 
195 200 205 

Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 
210 215 220 

Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 
225 230 235 240 

Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 
245 250 255 

Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 
260 265 270 

Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 
275 280 285 

His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 
290 295 300 

Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 
305 310 315 320 

Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 
325 330 335 

Arg Leu Leu Glu Asp Gly Asp 
340 

<210> 32 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB1 recognition sequence 
<400> 32 

tgaagcctgc ttttttatac taacttgagc gaa 33 

<210> 33 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-att recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 33 

rkycwgcttt yktrtacnaa stsgb 25 

<210> 34 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attB recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or c or g or t/u 
<400> 34 

agccwgcttt yktrtacnaa ctsgb 25 

<210> 35 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attR recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 35 

gttcagcttt cktrtacnaa ctsgb 25 

<210> 36 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attL recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 36 

agccwgcttt cktrtacnaa gtsgb 25 



<210> 37 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attP1 recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 37 

gttcagcttt yktrtacnaa gtsgb 25 

<210> 38 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB2 recognition sequence 
<400> 38 

agcctgcttt cttgtacaaa cttgt 25 

<210> 39 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB3 recognition sequence 
<400> 39 

acccagcttt cttgtacaaa cttgt 25 

<210> 40 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR1 recognition sequence 
<400> 40 

gttcagcttt tttgtacaaa cttgt 25 

<210> 41 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR2 recognition sequence 
<400> 41 

gttcagcttt cttgtacaaa cttgt 25 

<210> 42 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR3 recognition sequence 
<400> 42 

gttcagcttt cttgtacaaa gttgg 25 

<210> 43 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL1 recognition sequence 
<400> 43 

agcctgcttt tttgtacaaa gttgg 25 

<210> 44 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL2 recognition sequence 
<400> 44 

agcctgcttt cttgtacaaa gttgg 25 

<210> 45 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL3 recognition sequence 
<400> 45 

acccagcttt cttgtacaaa gttgg 25 

<210> 46 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP1 recognition sequence 
<400> 46 

gttcagcttt tttgtacaaa gttgg 25 

<210> 47 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP2, P3 recognition sequence 
<400> 47 

gttcagcttt cttgtacaaa gttgg 25 

<210> 48 
<211> 282 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP recognition sequence 
<400> 48 

ccttgcgcta atgctctgtt acaggtcact aataccatct aagtagttga ttcatagtga 60 
ctgcatatgt tgtgttttac agtattatgt agtctgtttt ttatgcaaaa tctaatttaa 120 
tatattgata tttatatcat tttacgtttc tcgttcagct tttttatact aagttggcat 180 
tataaaaaag cattgcttat caatttgttg caacgaacag gtcactatca gtcaaaataa 240 
aatcattatt tgatttcaat tttgtcccac tccctgcctc tg 282 

<210> 49 
<211> 1071 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> nucleotide sequence encoding Integrase E174R 
<221> CDS 

<222> (1)...(1071) 
<223> Integrase E174R 



<400> 49 

atg gga aga agg cga agt cat gag cgc cgg gat tta ccc cct aac ctt 48 
Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu 
1 5 10 15 

tat ata aga aac aat gga tat tac tgc tac agg gac cca agg acg ggt 96 
Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly 
20 25 30 

aaa gag ttt gga tta ggc aga gac agg cga atc gca atc act gaa gct 144 
Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala 
35 40 45 

ata cag gcc aac att gag tta ttt tca gga cac aaa cac aag cct ctg 192 
Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu 
50 55 60 

aca gcg aga atc aac agt gat aat tcc gtt acg tta cat tca tgg ctt 240 
Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu 
65 70 75 80 

gat cgc tac gaa aaa atc ctg gcc agc aga gga atc aag cag aag aca 288 
Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr 
85 90 95 

ctc ata aat tac atg agc aaa att aaa gca ata agg agg ggt ctg cct 336 
Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro 
100 105 110 

gat gct cca ctt gaa gac atc acc aca aaa gaa att gcg gca atg ctc 384 
Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu 
115 120 125 

aat gga tac ata gac gag ggc aag gcg gcg tca gcc aag tta atc aga 432 
Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg 
130 135 140 

tca aca ctg agc gat gca ttc cga gag gca ata gct gaa ggc cat ata 480 
Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile 
145 150 155 160 

aca aca aac cat gtc gct gcc act cgc gca gca aaa tct aga gta agg 528 
Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg 
165 170 175 

aga tca aga ctt acg gct gac gaa tac ctg aaa att tat caa gca gca 576 
Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala 
180 185 190 

gaa tca tca cca tgt tgg ctc aga ctt gca atg gaa ctg gct gtt gtt 624 
Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val 
195 200 205 

acc ggg caa cga gtt ggt gat tta tgc gaa atg aag tgg tct gat atc 672 
Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile 
210 215 220 

gta gat gga tat ctt tat gtc gag caa agc aaa aca ggc gta aaa att 720 
Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile 
225 230 235 240 

gcc atc cca aca gca ttg cat att gat gct ctc gga ata tca atg aag 768 
Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys 
245 250 255 

gaa aca ctt gat aaa tgc aaa gag att ctt ggc gga gaa acc ata att 816 
Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile 
260 265 270 

gca tct act cgt cgc gaa ccg ctt tca tcc ggc aca gta tca agg tat 864 
Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr 
275 280 285 

ttt atg cgc gca cga aaa gca tca ggt ctt tcc ttc gaa ggg gat ccg 912 
Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro 
290 295 300 

cct acc ttt cac gag ttg cgc agt ttg tct gca aga ctc cac gag aag 960 
Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys 
305 310 315 320 

cag ata agc gat aag ttt gct caa cat ctt ctc ggg cat aag tcg gac 1008 
Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp 
325 330 335 

acc atg gca tca cag tat cgt gat gac aga ggc agg gag tgg gac aaa 1056 
Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys 
340 345 350 

att gaa atc aaa taa 1071 
Ile Glu Ile Lys * 
355 

<210> 50 
<211> 356 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Integrase E174R 
<400> 50 

Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu 
1 5 10 15 

Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly 
20 25 30 

Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala 
35 40 45 

Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu 
50 55 60 

Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu 
65 70 75 80 

Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr 
85 90 95 

Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro 
100 105 110 

Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu 
115 120 125 

Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg 
130 135 140 

Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile 
145 150 155 160 

Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg 
165 170 175 

Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala 
180 185 190 

Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val 
195 200 205 

Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile 
210 215 220 

Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile 
225 230 235 240 

Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys 
245 250 255 

Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile 
260 265 270 

Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr 
275 280 285 

Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro 
290 295 300 

Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys 
305 310 315 320 

Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp 
325 330 335 

Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys 
340 345 350 

Ile Glu Ile Lys 
355 

<210> 51 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Lox P Site 
<400> 51 

ataacttcgt ataatgtatg ctatacgaag ttat 34 